Improve parsing of rss 2.0 feeds with yahoo media enclosures #731
Comments
jdragojevic
referenced this issue
Jul 19, 2013
Closed
Import: determine feed formats that we can get from Kaltura #728
bendk
Aug 6, 2013
Member
One question for this one is should the feed data overwrite the normal title/descriptions we get. For example for youtube videos we scrape the title/description from youtube.
I'm going to make it so that it does overwrite it, but tell me if it shouldn't.
bendk
Aug 6, 2013
Member
How can we know that a video URL is from Kaltura? I'm going to assume that if the hostname ends with kaltura.com, and we can scrape the ID correctly, then it's a Kaltura URL. Is that okay?
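A minimal sketch of that check, not the project's actual code. The function name `kaltura_entry_id` and the regular expression are hypothetical, and the assumption that the ID follows an `entryId` path segment comes from the `.../entryId/1_zr7niumr/` example in the issue description. It also matches `kaltura.com` and its subdomains rather than any hostname that merely ends with that string, which is a slightly stricter rule than the one proposed.

```python
import re
from urllib.parse import urlparse

# Assumed pattern: the entry ID follows an "entryId" path segment,
# as in the ".../entryId/1_zr7niumr/" example from the issue description.
ENTRY_ID_RE = re.compile(r"/entryId/([^/]+)")


def kaltura_entry_id(url):
    """Return the entry ID if `url` looks like a Kaltura URL, else None."""
    parsed = urlparse(url)
    hostname = parsed.hostname or ""
    # Accept kaltura.com itself or any subdomain, but not e.g. "notkaltura.com".
    if hostname != "kaltura.com" and not hostname.endswith(".kaltura.com"):
        return None
    # A Kaltura hostname alone is not enough: the ID must also be scrapable.
    match = ENTRY_ID_RE.search(parsed.path)
    return match.group(1) if match else None
```

Returning the ID (or `None`) from a single function keeps the "is it Kaltura?" decision and the ID scraping in one place, so the two can't disagree.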
added a commit
that referenced
this issue
Aug 6, 2013
added a commit
that referenced
this issue
Aug 6, 2013
bendk
Aug 6, 2013
Member
Couple of notes on the video Ids:
- We don't need to store this value in the DB because we can parse it from the URL
- We will only parse it from the URL if the VideoUrl object has type="K" for Kaltura. This should happen for all Kaltura URLs going forward, no matter how they get added to the system. But it won't work for Kaltura URLs that were already in the system before the changes. If we want to implement that we'll need a database migration.
- The parsing depends on us being able to identify Kaltura URLs. Hopefully the method I proposed above is good.
jdragojevic
Aug 7, 2013
Contributor
@bendk - I am not getting any title / description values on these videos when I import this feed.
http://www.kaltura.com/api_v3/getFeed.php?partnerId=1492321&feedId=0_py3x4ruz
jdragojevic
Aug 7, 2013
Contributor
@bendk - I have three test failures on the tests for check_api_v2.test_video_url_resource - they all pass against the latest version of the staging branch.
It could be how I am creating the videos: I use a factory, and then call
video_url = self.test_video.get_video_url() to get the url stored for the video, which is coming back as None.
It could be that I need to update something in the tests, but I'm not sure what would be correct.
======================================================================
FAIL: Verify video urls for a particular video are listed.
----------------------------------------------------------------------
Traceback (most recent call last):
File "apps/webdriver_testing/check_api_v2/test_video_url_resource.py", line 38, in test_list
self.assertEqual(video_url, response['objects'][0]['url'])
AssertionError: None != u'http://unisubs.example.com/0.mp4'
-------------------- >> begin captured logging << --------------------
test_steps: INFO: Video: DEmsCHxz9zsQ
test_steps: INFO: testcase: webdriver_testing.check_api_v2.test_video_url_resource.TestCaseVideoUrl.test_list
test_steps: INFO: description: Verify video urls for a particular video are listed.
--------------------- >> end captured logging << ---------------------
======================================================================
FAIL: Add an additional new url.
----------------------------------------------------------------------
Traceback (most recent call last):
File "apps/webdriver_testing/check_api_v2/test_video_url_resource.py", line 57, in test_url__post
self.assertIn(video_url, response['all_urls'], response)
AssertionError: None not found in [u'http://unisubs.example.com/0.mp4', u'http://unisubs.example.com/newurl.mp4'] : {u'description': u'Greatest Video ever made', u'all_urls': [u'http://unisubs.example.com/0.mp4', u'http://unisubs.example.com/newurl.mp4'], u'created': u'2013-08-07T05:52:39.854527', u'title': u'Test Video 0', u'site_url': u'http://unisubs.example.com:9000/videos/DEmsCHxz9zsQ/info/', u'languages': [], u'thumbnail': u'', u'resource_uri': u'/api2/partners/videos/DEmsCHxz9zsQ/', u'team': None, u'duration': None, u'original_language': None, u'id': u'DEmsCHxz9zsQ', u'metadata': {}}
-------------------- >> begin captured logging << --------------------
test_steps: INFO: testcase: webdriver_testing.check_api_v2.test_video_url_resource.TestCaseVideoUrl.test_url__post
test_steps: INFO: description: Add an additional new url.
--------------------- >> end captured logging << ---------------------
======================================================================
FAIL: Verify video urls for a particular video are listed.
----------------------------------------------------------------------
Traceback (most recent call last):
File "apps/webdriver_testing/check_api_v2/test_video_url_resource.py", line 103, in test_url__put_primary
self.assertEqual('http://unisubs.example.com/newerurl.mp4', self.test_video.get_video_url())
AssertionError: 'http://unisubs.example.com/newerurl.mp4' != None
-------------------- >> begin captured logging << --------------------
test_steps: INFO: testcase: webdriver_testing.check_api_v2.test_video_url_resource.TestCaseVideoUrl.test_url__put_primary
test_steps: INFO: description: Verify video urls for a particular video are listed.
--------------------- >> end captured logging << ---------------------
jdragojevic
Aug 7, 2013
Contributor
Here is a 1 video feed for testing: http://www.kaltura.com/api_v3/getFeed.php?partnerId=1492321&feedId=0_o84bng6j
jdragojevic
Aug 7, 2013
Contributor
btw - in the kaltura feed - you are forced to enter a base url value that's used as the link, ex:
<link>http://qa.amara.org/1_zlgl6ut8</link>
but it's not a valid value, so we shouldn't use it; we should use the media:content url instead.
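A rough sketch of that preference, assuming the feed uses the standard Media RSS namespace (`http://search.yahoo.com/mrss/`). The helper name `video_url_for_item` and the sample item are made up for illustration; the sample is not copied from a real Kaltura feed, and only the `<link>` value comes from the comment above.

```python
import xml.etree.ElementTree as ET

# Media RSS namespace used by "yahoo media" enclosures.
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}


def video_url_for_item(item):
    """Return the media:content url for an RSS <item>, ignoring <link>."""
    # Search nested elements too, since media:content may sit in a media:group.
    content = item.find(".//media:content", MEDIA_NS)
    if content is not None and content.get("url"):
        return content.get("url")
    # Deliberately no fallback to <link>: in Kaltura feeds it is a placeholder.
    return None


# Hypothetical item, for illustration only.
SAMPLE_ITEM = """
<item xmlns:media="http://search.yahoo.com/mrss/">
  <link>http://qa.amara.org/1_zlgl6ut8</link>
  <media:content url="http://example.kaltura.com/entryId/1_zlgl6ut8/video.mp4"/>
</item>
"""

item = ET.fromstring(SAMPLE_ITEM.strip())
print(video_url_for_item(item))
```

Whether a missing media:content should skip the item or fall back to something else is a policy question this sketch leaves open.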
I just pushed a fix for all of the above issues.
jdragojevic
referenced this issue
Aug 8, 2013
Closed
Improve parsing of rss 2.0 feeds with itunes enclosures #732
jdragojevic
Aug 8, 2013
Contributor
- Verified that the selenium tests are passing now. Ran tests for check_api_v2, check_create_page, check_videos, check_teams...
- Checked on the gh-31 demo that we can add itunes and yahoo rss feeds to the site and they are displayed with the title / description metadata. Agree with the comment that we should use the values in the feed rather than the ones we can scrape.
- Checked locally that we can add the feed to teams and the title / description values are stored for the videos.
- Commented in #837 that we will need that fixed relatively soon, because the teams that use the Kaltura integration will be regularly updating their feeds, and they will need to be associated with the team.
jdragojevic commented Jul 18, 2013
An item in a Kaltura generated rss feed looks something like this:
We need to improve our parsing of these feeds so that we set the Video title, description and thumbnail when importing videos to teams.
Additionally, the media:content url includes the Kaltura unique id - we may want to grab this value and store it for future syncing of subs.
ex. in http://....entryId/1_zr7niumr/ - 1_zr7niumr is the piece we may want to store as a unique id.
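One way the title, description and thumbnail could be pulled from a feed item is sketched below. It assumes the standard Media RSS elements (`media:title`, `media:description`, `media:thumbnail`) and plain RSS `title` / `description`; the helper name `item_metadata` is hypothetical, and the real importer may be structured quite differently.

```python
import xml.etree.ElementTree as ET

# Media RSS namespace used by "yahoo media" enclosures.
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}


def item_metadata(item):
    """Collect title, description and thumbnail URL from one RSS <item>."""

    def text(path):
        # Return the stripped text of the first match, or None if absent/empty.
        node = item.find(path, MEDIA_NS)
        return node.text.strip() if node is not None and node.text else None

    thumb = item.find(".//media:thumbnail", MEDIA_NS)
    return {
        # Fall back to the media:* elements when the plain RSS ones are missing.
        "title": text("title") or text(".//media:title"),
        "description": text("description") or text(".//media:description"),
        "thumbnail": thumb.get("url") if thumb is not None else None,
    }
```

Any key that comes back as `None` can then be left to whatever value the video already has, while present values overwrite it, in line with the "feed data wins" decision discussed in the thread.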