Skip to content

Bug 3448 Make youtube video URLs visible to non-flash users. One edited ... #338

Closed
wants to merge 4 commits into from

3 participants

@deborahgu

...test, config changes to allow youtube apis.

  • If this goes prod memcache should probably be cleared for this memkey.
  • I don't have an eye for style and wasn't sure if this link should be styled at all.
  • Should the api call be a schwartz worker to make it async? It does only get called once per created embed.
@afuna
Dreamwidth Studios member
afuna commented Apr 19, 2013

I've been thinking about the call to a worker -- btw gearman would work better here than theschwartz. I think I'd prefer it being async, though, because even though it's created only once per embed, if you do it synchronously, you'd have to wait for it to finish loading before you could show the "success" page. Whereas if it was async, you could show the success page after we're done processing things locally, fire something off...

@afuna afuna commented on an outdated diff Apr 19, 2013
cgi-bin/LJ/EmbedModule.pm
+ my $ua = LJ::get_useragent( role => 'vimeo', timeout => 60 );
+ my $api_url = "http://vimeo.com/api/v2/video/"
+ . $vid_id
+ . ".json";
+
+ # Pass request to the user agent and get a response back
+ my $request = HTTP::Request->new(GET => $api_url);
+ my $res = $ua->request($request);
+
+ # Check the outcome of the response
+ if ($res->is_success) {
+ my $obj = from_json( $res->content );
+ $linktext = '"'
+ . ${$obj}[0]{'title'}
+ . '" '
+ . BML::ml('embedmedia.vimeo')
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

use LJ::Lang::ml here please! BML::ml should only be used on bml pages (and hopefully not even there, gone in a general phaseout...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna commented on an outdated diff Apr 19, 2013
cgi-bin/LJ/EmbedModule.pm
+ . ".json";
+
+ # Pass request to the user agent and get a response back
+ my $request = HTTP::Request->new(GET => $api_url);
+ my $res = $ua->request($request);
+
+ # Check the outcome of the response
+ if ($res->is_success) {
+ my $obj = from_json( $res->content );
+ $linktext = '"'
+ . ${$obj}[0]{'title'}
+ . '" '
+ . BML::ml('embedmedia.vimeo')
+ . ")";
+ } else {
+ warn "error getting video info from Vimeo: ", $res->status_line, "\n";
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

cough. warn

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna commented on an outdated diff Apr 19, 2013
cgi-bin/LJ/EmbedModule.pm
@@ -257,6 +267,101 @@ sub _extract_num_unit {
return ( $num, "%" );
}
+# extract src info if useful
+# returns a hash of link text, url
+# currently handles: YouTube, Vimeo
+sub extract_src_info {
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

I do think that given how well this is separated out, it should be fairly straightforward to convert to be used by a gearman worker, and just fire off a job when the embed is created/edited!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna commented on the diff Apr 19, 2013
etc/config.pl
@@ -71,9 +71,9 @@
# );
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

Config snuck in.

BTW, I'd suggest putting configs into ext/local, following the instructions here:

http://wiki.dwscoalition.org/wiki/index.php/Dreamwidth_Scratch_Installation#Editing_the_config_files

so you don't have to worry about configs in the future.

@deborahgu
deborahgu added a note Apr 23, 2013

Grrr. Yeah, I did that, and for some reason it stopped working until I added it 2 places. I meant to debug but just put it in the etc dir as a stopgap and forgot about it. sigh.

@afuna
Dreamwidth Studios member
afuna added a note Apr 23, 2013

Hmm. Added to which two places? Anyway would be happy to help you poke at it if you want, or just cheer you on otherwise if you're poking it on your onw!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna commented on the diff Apr 19, 2013
doc/config-private.pl.txt
@@ -155,6 +155,10 @@
# email => ,
#);
+ #%YOUTUBE =(
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

Hmm. what's the difference between this, and the one in config-local.pl.txt? do we need both or just one?

@deborahgu
deborahgu added a note Jun 8, 2013

The one in private has the API key, the one in local has the URL. (for the record, when it gets pushed live we will need a Google API key. We can use the one that I registered, or somebody with a site administrative address can register more officially.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna commented on an outdated diff Apr 19, 2013
bin/upgrading/update-db-general.pl
@@ -4143,6 +4147,21 @@
"ALTER TABLE externalaccount " .
"ADD COLUMN active enum('1', '0') NOT NULL default '1'" );
}
+
+ # Needed to cache embed titles to minimize external API calls
+ if ( column_type( "embedcontent", "title" ) eq '' ) {
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

I think an error here -- "title" -- but there's no title column. Maybe you meant linktext? (same for the second table below too)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna commented on an outdated diff Apr 19, 2013
cgi-bin/LJ/EmbedModule.pm
## where new items overwrites old ones
my $table_name = ($preview) ? 'embedcontent_preview' : 'embedcontent';
- $journal->do("REPLACE INTO $table_name (userid, moduleid, content) VALUES ".
- "(?, ?, ?)", undef, $journal->userid, $id, $cmptext);
+ $journal->do( "REPLACE INTO $table_name " .
+ "(userid, moduleid, content, linktext, url) " .
+ "VALUES (?, ?, ?, ?, ?)",
+ undef, $journal->userid, $id, $cmptext, $src_info->{'linktext'}, $src_info->{'url'} );
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

(style) don't need the quotes around the hash key here (same in a bunch of places too)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna commented on the diff Apr 19, 2013
cgi-bin/LJ/EmbedModule.pm
die $journal->errstr if $journal->err;
# save in memcache
my $memkey = $class->memkey($journal->userid, $id, $preview);
- LJ::MemCache::set($memkey, $contents);
+ my $cref = { content => $cmptext,
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

hmm, I'd suggest changing the memkey from e.g., embedcontpreviwe to embedcontpreview2, etc, because it's otherwise not easy to dump just a subset of memcache.

@deborahgu
deborahgu added a note Jun 8, 2013

Thanks. That's easier than clearing memcache, which is what I thought would be the best solution. :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna commented on the diff Apr 19, 2013
cgi-bin/LJ/Web.pm
@@ -3829,6 +3833,7 @@ sub placeholder_link {
<a href="$link">
<img src="$img" class="LJ_Placeholder" title="Click to show embedded content" />
</a>
+ $direct_link
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

I like this. I think I'd have forgotten to update here...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna and 1 other commented on an outdated diff Apr 19, 2013
cgi-bin/LJ/EmbedModule.pm
@@ -257,6 +267,101 @@ sub _extract_num_unit {
return ( $num, "%" );
}
+# extract src info if useful
+# returns a hash of link text, url
+# currently handles: YouTube, Vimeo
+sub extract_src_info {
+ my ($class, $content) = @_;
+ my ($url, $site, $href, $linktext);
+
+ if ( $content =~ /src="http:\/\/.*youtube\.com/ ) {
+ # YouTube
+
+ my $host = "https://www.youtube.com/";
+ my $prefix = "watch?v=";
+
+ # get the video ID
+ $content =~ /.*src="[^"]*embed\/([^"]*)".*/;
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

Hmm. Why does the first regex contain youtube.com, whereas the second one doesn't?

@deborahgu
deborahgu added a note Apr 23, 2013

So this was a little handwavy, because the url styles can change at any time. But the idea was if it's a youtube embed, your regexp has already selected for that, so now we're just selecting for the video string in the attribute of the style src="http://www.youtube.com/embed/MfstYSUscBc"

I could put it there, too -- would make the regexp slightly more efficient. My thought proces was that if YT changed the embed address (say, to embed.youtu.be, for example) that would leave us fewer places to change the string. I'm not married to either way.

@afuna
Dreamwidth Studios member
afuna added a note Apr 24, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna commented on an outdated diff Apr 19, 2013
cgi-bin/LJ/EmbedModule.pm
+ . "&key="
+ . $apikey
+ . "&part=snippet";
+ # Pass request to the user agent and get a response back
+ my $request = HTTP::Request->new(GET => $queryurl);
+ my $res = $ua->request($request);
+
+ # Check the outcome of the response
+ if ($res->is_success) {
+ my $obj = from_json( $res->content );
+ $linktext = '"'
+ . ${$obj}{'items'}[0]{'snippet'}{'title'}
+ . '" ('
+ . BML::ml('embedmedia.youtube')
+ . ")";
+ } else {
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

warn here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna commented on the diff Apr 19, 2013
cgi-bin/LJ/EmbedModule.pm
@@ -366,12 +474,14 @@ sub module_iframe_tag {
# append a random string to the name so it can't be targetted by links
my $id = "embed_${journalid}_$moduleid";
my $name = "${id}_" . LJ::make_auth_code( 5 );
-
+ my $direct_link = defined $url
+ ? '<div><a href="' . $url . '">' . $linktext . '</a></div>' : '';
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

Since linktext / URL are from outside sources, we should make sure to ehtml these to make sure that they don't cause issues. This needs to be done any time we use the linktext/url on a page.

@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Important) Yes please, add the ehtml call.

@deborahgu
deborahgu added a note Jun 8, 2013

yep, sanitizing, headdesk.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna afuna commented on the diff Apr 19, 2013
cgi-bin/LJ/Web.pm
my $img = delete $opts{img} || "$LJ::IMGPREFIX/videoplaceholder.png";
+ my $direct_link = defined $url
+ ? '<div class="embed_link"><a href="' . $url . '">' . $linktext . '</a></div>' : '';
@afuna
Dreamwidth Studios member
afuna added a note Apr 19, 2013

(same concern here, re: escaping)

@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Important) Yup, escaping.

@deborahgu
deborahgu added a note Jun 8, 2013

I went ahead and did this at define-time, rather than at use, so it's harder to miss if a new use turns up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@afuna
Dreamwidth Studios member
afuna commented Apr 19, 2013

Phew, way overdue for which I apologize! I got caught up in the to-worker-or-not question -- but that just took way too long. So sorry!

@zorkian zorkian and 1 other commented on an outdated diff Jun 8, 2013
cgi-bin/DW/Worker/EmbedWorker.pm
+ if keys %$arg;
+ return $job->permanent_failure("Missing argument")
+ unless defined $contents && defined $journalid;
+
+ my $result = LJ::EmbedModule->contact_external_sites({
+ vid_id => $vid_id,
+ host => $host,
+ preview => $preview,
+ contents => $contents,
+ cmptext => $cmptext,
+ journalid => $journalid,
+ id => $id,
+ linktext => $linktext,
+ url => $url,
+ });
+ my $sclient = LJ::theschwartz();
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

Why do you get this client?

@deborahgu
deborahgu added a note Jun 8, 2013

Because I didn't delete a line of defunct code. :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@zorkian zorkian commented on an outdated diff Jun 8, 2013
cgi-bin/LJ/EmbedModule.pm
@@ -49,8 +52,7 @@ sub save_module {
my $need_new_id = !defined $id;
if (defined $id) {
- my $old_content = $class->module_content( moduleid => $id,
- journalid => LJ::want_userid($journal) ) || '';
+ my $old_content = $class->module_content( moduleid => $id, journalid => LJ::want_userid($journal))->{content} || '';
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Style) Why did you break the multi-line? We prefer breaking long lines. (Scrolling is a bummer!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@zorkian zorkian commented on an outdated diff Jun 8, 2013
t/clean-embed.t
@@ -474,7 +474,7 @@ note( "Testing parse_embed (We parse the embed contents first from a post)" );
# check the iframe contents
# LJ::EmbedModule takes the content and cleans it
- my $got_embed = LJ::EmbedModule->module_content( journalid => $u->userid, moduleid => 1 );
+ my $got_embed = LJ::EmbedModule->module_content( journalid => $u->userid, moduleid => 1 )->{'content'};
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Style) Don't quote hash keys.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@zorkian zorkian commented on an outdated diff Jun 8, 2013
htdocs/tools/embedcontent.bml
@@ -39,7 +39,7 @@ _c?>
journalid => $journalid,
moduleid => $moduleid,
preview => $preview,
- );
+ )->{'content'};
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Style) Don't quote hash keys.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@zorkian zorkian commented on an outdated diff Jun 8, 2013
cgi-bin/LJ/EmbedModule.pm
+
+ if ( $contents =~ /src="http:\/\/.*youtube\.com/ ) {
+ # YouTube
+
+ my $host = "https://www.youtube.com/";
+ my $prefix = "watch?v=";
+
+ # construct the URL and link text
+ $contents =~ /.*src="[^"]*embed\/([^"]*)".*/;
+ my $vid_id = $1;
+ $url = $host . $prefix . $vid_id;
+ $linktext = LJ::Lang::ml('embedmedia.youtube');
+
+ # Fire off the worker to get the correct title
+ my $sclient = LJ::theschwartz();
+ die "Can't get TheSchwartz client" unless $sclient;
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Style) Marginally cleaner to say:

my $sclient = LJ::theschwartz()
    or croak "Can't get TheSchwartz client";
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@zorkian zorkian commented on an outdated diff Jun 8, 2013
cgi-bin/LJ/EmbedModule.pm
+ my $sclient = LJ::theschwartz();
+ die "Can't get TheSchwartz client" unless $sclient;
+ my $job = TheSchwartz::Job->new_from_array("DW::Worker::EmbedWorker",
+ { vid_id => $vid_id,
+ host => 'youtube',
+ preview => $preview,
+ contents => $contents,
+ cmptext => $cmptext,
+ journalid => $journal->id,
+ preview => $preview,
+ id => $id,
+ linktext => $linktext,
+ url => $url,
+ });
+ die "Can't create job" unless $job;
+ $sclient->insert($job) or die ("Can't queue youtube api job: $@");
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Style) Don't use parenthesis on this argument. Also, prefer croak.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@zorkian zorkian commented on an outdated diff Jun 8, 2013
cgi-bin/LJ/EmbedModule.pm
+ preview => $preview,
+ contents => $contents,
+ cmptext => $cmptext,
+ journalid => $journal->id,
+ preview => $preview,
+ id => $id,
+ linktext => $linktext,
+ url => $url,
+ });
+ die "Can't create job" unless $job;
+ $sclient->insert($job) or die ("Can't queue youtube api job: $@");
+
+ } elsif ( $contents =~ /src="http:\/\/.*vimeo\.com/ ) {
+ # Vimeo's default c/p embed code contains a link to the
+ # video by title. If that's present, don't build a link.
+ warn "$contents";
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Important) Debugging crept in!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@zorkian zorkian commented on an outdated diff Jun 8, 2013
cgi-bin/LJ/EmbedModule.pm
+ # video by title. If that's present, don't build a link.
+ warn "$contents";
+ my $host = "http://vimeo.com/";
+
+ # get the video ID
+ $contents =~ /.*src="[^"]*vimeo\.com\/video\/([^"]*)".*/;
+ my $vid_id = $1;
+
+ $url = $host . $vid_id;
+
+ # error getting video info from Vimeo
+ $linktext = LJ::Lang::ml('embedmedia.vimeo');
+
+ # Fire off the worker to get the correct title
+ my $sclient = LJ::theschwartz();
+ die "Can't get TheSchwartz client" unless $sclient;
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Style) As above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@zorkian zorkian commented on an outdated diff Jun 8, 2013
cgi-bin/LJ/EmbedModule.pm
+ my $sclient = LJ::theschwartz();
+ die "Can't get TheSchwartz client" unless $sclient;
+ my $job = TheSchwartz::Job->new_from_array("DW::Worker::EmbedWorker",
+ { vid_id => $vid_id,
+ host => 'vimeo',
+ preview => $preview,
+ contents => $contents,
+ cmptext => $cmptext,
+ journalid => $journal->id,
+ preview => $preview,
+ id => $id,
+ linktext => $linktext,
+ url => $url,
+ });
+ die "Can't create job" unless $job;
+ $sclient->insert($job) or die ("Can't queue vimeo api job: $@");
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Style) As above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@zorkian zorkian commented on an outdated diff Jun 8, 2013
cgi-bin/LJ/EmbedModule.pm
+ die "Can't create job" unless $job;
+ $sclient->insert($job) or die ("Can't queue youtube api job: $@");
+
+ } elsif ( $contents =~ /src="http:\/\/.*vimeo\.com/ ) {
+ # Vimeo's default c/p embed code contains a link to the
+ # video by title. If that's present, don't build a link.
+ warn "$contents";
+ my $host = "http://vimeo.com/";
+
+ # get the video ID
+ $contents =~ /.*src="[^"]*vimeo\.com\/video\/([^"]*)".*/;
+ my $vid_id = $1;
+
+ $url = $host . $vid_id;
+
+ # error getting video info from Vimeo
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Trivial) This comment seems incorrect?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@zorkian zorkian commented on an outdated diff Jun 8, 2013
cgi-bin/LJ/EmbedModule.pm
+ . ${$obj}[0]{title}
+ . '" ('
+ . LJ::Lang::ml('embedmedia.vimeo')
+ . ")";
+ } else {
+ # error getting video info from Vimeo
+ $linktext = LJ::Lang::ml('embedmedia.vimeo');
+ }
+ } else {
+ # Not one of our known embed types
+ return 'fail';
+ }
+
+ ## embeds for journal entry pre-post preview are stored in a special table,
+ ## where new items overwrites old ones
+ my $table_name = ($preview) ? 'embedcontent_preview' : 'embedcontent';
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Style) Don't need the parenthesis around the first bit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@zorkian zorkian commented on an outdated diff Jun 8, 2013
cgi-bin/LJ/EmbedModule.pm
+ . '" ('
+ . LJ::Lang::ml('embedmedia.vimeo')
+ . ")";
+ } else {
+ # error getting video info from Vimeo
+ $linktext = LJ::Lang::ml('embedmedia.vimeo');
+ }
+ } else {
+ # Not one of our known embed types
+ return 'fail';
+ }
+
+ ## embeds for journal entry pre-post preview are stored in a special table,
+ ## where new items overwrites old ones
+ my $table_name = ($preview) ? 'embedcontent_preview' : 'embedcontent';
+ $journal->do( "REPLACE INTO $table_name " .
@zorkian
Dreamwidth Studios member
zorkian added a note Jun 8, 2013

(Style) I find all the quotes, dots, etc to be distracting and easy to mess up. Lately I've been doing this:

$journal->do(
    qq{REPLACE INTO $table_name
           (userid, moduleid, content, linktext, url)
       VALUES (?, ?, ?, ?, ?)},
    undef, $journal->userid, $id, $cmptext, $linktext, $url
);

It's just a style thing, but it looks cleaner to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@deborahgu

two reminders if this commit is ready to go: it needs an API key, and it needs workers.conf to be edited.

@zorkian
Dreamwidth Studios member
zorkian commented Jun 10, 2013

There were some new warn "FOO" lines added, so I took the liberty of cleaning it up. I pulled down the branch, flattened the commits into one, removed the warnings, trimmed trailing whitespace, and removed the changes to etc/config.pl. Functionally, I changed nothing.

7e6c7ca

I recommend you investigate settings for your text editor of choice to either highlight or auto-strip trailing whitespace. I find it really useful to be able to see it but not auto-strip it, personally, to make sure I don't commit any.

Thank you so much for the patch and all of your patience with this process! :)

@zorkian zorkian closed this Jun 10, 2013
@deborahgu deborahgu deleted the deborahgu:3448 branch Feb 17, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.