Adds transcript block and meta #221

Merged
jeffpaul merged 45 commits into develop from feature/transcript on Dec 4, 2023

Contributor

@nateconley commented Feb 10, 2023

Description of the Change

Adds a block that saves the transcript to post meta and allows the editor to link to the transcript, display the transcript, or make the transcript accessible only via the endpoint.

Closes #28

TODO:

  • Add the endpoint for the transcript as a barebones HTML file
  • Add the endpoint to the podcast RSS feed per item: <podcast:transcript url="https://example.com/podcasts/podcast-name/episode-slug/transcript.html" type="text/html" language="en" />
  • Add endpoint to block link
  • Add tests
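
For the feed item in the TODO list above, a minimal sketch of building the element might look like the following. The function name `example_transcript_feed_tag()` and its parameters are hypothetical and not taken from this PR. In a plugin, the result would typically be echoed from a callback on WordPress's `rss2_item` action, and the feed would also need the `podcast` XML namespace declared.

```php
<?php
/**
 * Build a <podcast:transcript> element for one feed item.
 *
 * Hypothetical helper for illustration; not code from this PR.
 */
function example_transcript_feed_tag( string $url, string $type = 'text/html', string $language = 'en' ): string {
	// Escape attribute values for XML output.
	$flags = ENT_QUOTES | ENT_XML1;

	return sprintf(
		'<podcast:transcript url="%s" type="%s" language="%s" />',
		htmlspecialchars( $url, $flags, 'UTF-8' ),
		htmlspecialchars( $type, $flags, 'UTF-8' ),
		htmlspecialchars( $language, $flags, 'UTF-8' )
	);
}

// Example usage with the URL shape from the TODO item.
echo example_transcript_feed_tag( 'https://example.com/podcasts/podcast-name/episode-slug/transcript.html' ), "\n";
```

Keeping the string building separate from the hook callback makes the escaping easy to unit test without loading WordPress.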

Alternate Designs

This approach implements the transcript as a block. Implementing this functionality as a meta field was also discussed.

Possible Drawbacks

This approach does not account for the classic editor.

Verification Process

  • Add a transcript from the podcast block sidebar

Checklist:

  • I have read the CONTRIBUTING document.
  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my change.
  • All new and existing tests passed.

Changelog Entry

Added - Podcast transcripts

Credits

@nateconley

@jeffpaul requested review from a team and peterwilsoncc and removed request for a team April 11, 2023 14:29
@jeffpaul requested a review from a team April 25, 2023 18:28
@nateconley marked this pull request as ready for review April 28, 2023 05:21
@nateconley
Contributor Author

Hi @peterwilsoncc! This is now ready for review. There is much more discussion on the approach to this in the original issue thread: #28

I added two unit tests, but my functions do not have full coverage. I'm happy to add more if given a bit of direction. I did not add any new E2E tests. Again, happy to add those in if you can tell me what coverage you'd like to see.

When building the new transcript block, I tried to use a more current approach to block building without changing how the original block is written (although where I did add to those files, the Prettier config made some formatting edits).

One thing to note is that this update will require flushing rewrite rules. How do you usually like to approach that? Upgrade instructions? Upgrade routine?

@jeffpaul removed the request for review from a team May 8, 2023 14:36
Contributor

@peterwilsoncc left a comment

I've added a few notes inline.

I'm not sure it is possible to flush the rewrite rules automatically upon updating a plugin. According to the docs, the upgrader_process_complete hook runs using the old version of the plugin, not the new.

It's a little ugly but you might be able to use an option simple_podcasting_db_version and flush rewrite rules if it's not set or less than two. If this is the approach taken, I'd recommend limiting the check so it only happens within the dashboard. The rules don't need to be flushed prior to someone visiting the admin and creating/editing a new podcast.

Other people may have better suggestions though, maybe ask around a bit wider.
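
The option-based check described above might be sketched as follows. This is only an illustration under stated assumptions: the option name `simple_podcasting_db_version` comes from the comment, while the function names, the version constant, and the choice of the `admin_init` hook are hypothetical. `get_option()`, `update_option()`, `flush_rewrite_rules()`, and `add_action()` are standard WordPress functions; the guard at the end lets the file load outside WordPress.

```php
<?php
// Hypothetical schema version that introduces the transcript rewrite rule.
const EXAMPLE_PODCASTING_DB_VERSION = 2;

/**
 * Decide whether rewrite rules need flushing.
 *
 * @param mixed $stored Value of the version option; false when it was never set.
 * @param int   $target Version that requires the new rewrite rules.
 */
function example_needs_rewrite_flush( $stored, int $target = EXAMPLE_PODCASTING_DB_VERSION ): bool {
	return false === $stored || (int) $stored < $target;
}

/**
 * Flush once, then record the version so later requests skip the flush.
 */
function example_maybe_flush_rewrite_rules(): void {
	if ( example_needs_rewrite_flush( get_option( 'simple_podcasting_db_version' ) ) ) {
		flush_rewrite_rules();
		update_option( 'simple_podcasting_db_version', EXAMPLE_PODCASTING_DB_VERSION );
	}
}

// admin_init runs on admin requests, which keeps the check off normal front-end page loads.
if ( function_exists( 'add_action' ) ) {
	add_action( 'admin_init', 'example_maybe_flush_rewrite_rules' );
}
```

Because the version is stored after the flush, the relatively expensive `flush_rewrite_rules()` call should run only once per site after the upgrade.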

includes/blocks.php Outdated
$podcast_slug = get_query_var( 'podcasting-episode' );
$post_object = get_page_by_path( $podcast_slug, OBJECT, 'post' );
if ( $post_object instanceof WP_Post ) {
echo wp_kses_post(
Contributor

You can run the transcript through kses on save rather than on render, and then treat it as trusted data on output. This is what WordPress core does for post content and similar data, as kses is a relatively expensive function to run.
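
One way to sanitize on save rather than on render is to register the meta key with a sanitize callback. The following is a minimal sketch, assuming a hypothetical meta key `example_podcast_transcript` (the key this PR actually uses is not shown in this thread) and hypothetical function names. `register_post_meta()` and `wp_kses_post()` are standard WordPress functions.

```php
<?php
/**
 * Arguments for registering the transcript meta (illustrative only).
 */
function example_transcript_meta_args(): array {
	return array(
		'type'              => 'string',
		'single'            => true,
		'show_in_rest'      => true,
		// Runs whenever the meta value is written, so output code
		// can print the stored value without calling kses again.
		'sanitize_callback' => 'wp_kses_post',
	);
}

/**
 * Register the hypothetical transcript meta key for posts.
 */
function example_register_transcript_meta(): void {
	register_post_meta( 'post', 'example_podcast_transcript', example_transcript_meta_args() );
}

// Only hook in when WordPress is loaded.
if ( function_exists( 'add_action' ) ) {
	add_action( 'init', 'example_register_transcript_meta' );
}
```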

Member

@peterwilsoncc I've addressed this feedback in this commit: 9264fa4. Thanks

@@ -0,0 +1,31 @@
<?php
Contributor

I think it would be better to use the post.php template for the transcript when displaying it via a link. I don't really understand the comment "Intentionally barebones with the minimum html for use by tools". Why have you done that?

Contributor Author

@peterwilsoncc The intention here was to follow a spec defined by Podcastindex.org. It is intentionally barebones because there may be players that are able to read the HTML. Using the post.php template makes sense for humans reading this. Should there be two versions? Original comment here: #28 (comment)

Contributor

> The intention here was to follow a spec defined by Podcastindex.org

@nateconley Understood and makes sense.

Do you think it's worth noindexing the transcript page so it doesn't end up in search engines? To allow developers to filter the results for their site, the code could be:

if ( function_exists( 'wp_robots' ) && function_exists( 'wp_robots_no_robots' ) && function_exists( 'add_filter' ) ) {
	add_filter( 'wp_robots', 'wp_robots_no_robots' );
	wp_robots();
}

Contributor Author

Thanks for the great suggestion. This has been added.

includes/blocks/podcast-transcript/markup.php Outdated
includes/blocks/podcast-transcript/markup.php Outdated
includes/transcripts.php Outdated
includes/transcripts.php Outdated
includes/transcripts.php Outdated
includes/transcripts.php Outdated
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title><?php esc_html_e( 'Transcript', 'simple-podcasting' ); ?> - <?php echo esc_html( get_the_title() ); ?></title>
Contributor

This is better for internationalization.

I've removed the escaping for get_the_title() as it's escaped on save and considered safe.

Suggested change
- <title><?php esc_html_e( 'Transcript', 'simple-podcasting' ); ?> - <?php echo esc_html( get_the_title() ); ?></title>
+ <title><?php
+ printf(
+ 	esc_html__( 'Transcript - %s', 'simple-podcasting' ),
+ 	get_the_title()
+ );
+ ?></title>

Member

@peterwilsoncc, most of the data is sanitised/escaped on save, so this is a very small concern. However, if an attacker got hold of the database or was able to modify values in wp_posts, I think an XSS attack would be easy here, and I don't see much of a downside to escaping (unless we run into double-escaping issues). So I'm in favour of keeping the escape, but let me know what you think.

@nateconley
Contributor Author

@peterwilsoncc I left a comment on one of your concerns. I do not have time until at least two weeks from now to dig into this more.

@jeffpaul modified the milestones: 1.5.0, 1.6.0 Jun 28, 2023
Contributor

@peterwilsoncc left a comment

I've added a few notes following some branch testing.

The taxonomy name constant was renamed so I pushed c8216c25f70ef48c7b0971576a2758c550e98768 to resolve that.

printf(
/* translators: %s: The page title */
esc_html__( 'Transcript - %s', 'simple-podcasting' ),
get_the_title() // phpcs:ignore WordPress.Security.EscapeOutput
Contributor

Suggested change
- get_the_title() // phpcs:ignore WordPress.Security.EscapeOutput
+ wp_strip_all_tags( get_the_title() )

WordPress allows HTML in titles, and browsers render the tags as literal text in the page title, so this avoids tabs that look like this:

[Image: Screen Shot 2023-09-20 at 10 44 53 am]

Contributor Author

Changed.


registerFormatType('podcasting/transcript-time', {
title: __('Time', 'simple-podcasting'),
tagName: 'time',
Contributor

As you are using kses to sanitize, you'll need to add the time element to the list of allowed tags using the wp_kses_allowed_html filter.
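
A filter along these lines would let the `time` element survive kses. This is a sketch rather than the code added in this PR: the callback name, the decision to apply it only in the `post` context, and the `datetime` attribute are assumptions. The `wp_kses_allowed_html` filter passes the allowed-tags array and the context to its callbacks.

```php
<?php
/**
 * Allow <time datetime=""> in post-context kses (hypothetical callback).
 *
 * @param array        $allowed_tags Allowed tags keyed by tag name.
 * @param string|array $context      Kses context, e.g. 'post'.
 * @return array Allowed tags, with `time` added for the post context.
 */
function example_allow_time_element( $allowed_tags, $context ) {
	if ( 'post' === $context ) {
		$allowed_tags['time'] = array( 'datetime' => true );
	}
	return $allowed_tags;
}

// Only hook in when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
	// Priority 10, two accepted arguments ($allowed_tags, $context).
	add_filter( 'wp_kses_allowed_html', 'example_allow_time_element', 10, 2 );
}
```

Limiting the change to the `post` context keeps stricter contexts, such as comment sanitization, unchanged.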

Contributor Author

Added this filter.

foreach ( $body_node->childNodes as $node ) {
// phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
if ( XML_TEXT_NODE === $node->nodeType ) {
$filtered_text .= '<p>' . $doc->saveHTML( $node ) . '</p>';
Contributor

I'm seeing some strange markup here when using the citation button.

The markup is rendered as

<cite>Person 1</cite>
<p>: Welcome to Show</p>
<br>
<cite>Person 2</cite>
<p>: I'm P2</p>
<br>
<cite>Person 1:</cite>
<p> and I'm P1.</p>
<br>
</body>

The markup is saved in post meta as:

<cite>Person 1</cite>: Welcome to Show<br>
<cite>Person 2</cite>: I'm P2<br>
<cite>Person 1:</cite> and I'm P1.<br>

Contributor Author

I was trying to make RichText do too much. This should have been InnerBlocks, and has been changed. The added benefit is that this is much more user friendly now.

@nateconley
Contributor Author

Quick update: I will have all comments addressed here by EOW.

@github-actions bot added the needs:code-review label Nov 19, 2023
@nateconley
Contributor Author

@peterwilsoncc This is ready for a re-review.

@nateconley
Contributor Author

I could also use a bit of help with the failing e2e tests. It looks like those might be caused by a missing dependency.

@peterwilsoncc
Contributor

Sorry Nate, I'm still having problems while testing.

If I start a citation within a transcript block, then the editor stays in citation mode after I press return.

[Image: cite]

Reviewing some documentation, it looks like having them as blocks rather than inline is a good improvement.

@nateconley
Contributor Author

@peterwilsoncc Thanks for the catch! I have modified the cite and time inner blocks to behave similarly to other core blocks: when you press return, a paragraph block is created.

Again, it looks like the Cypress tests are failing, but from what I can see this is unrelated to this branch.

Contributor

@peterwilsoncc left a comment

Thanks @nateconley, this looks good to me.

I pushed a fix for the failing E2E tests by downgrading mochawesome-json-to-md. For some reason version 1.x isn't working properly with our GitHub actions.

@jeffpaul merged commit e72657b into develop Dec 4, 2023
11 checks passed
@jeffpaul deleted the feature/transcript branch December 4, 2023 14:29
Labels
needs:code-review This requires code review.
Development

Successfully merging this pull request may close these issues.

Add transcript support for Level A Accessibility
9 participants