MediaCloud Python API Client
This is a Python client for accessing the MediaCloud API v2. We support Python versions 2.7 and 3.6.
First sign up for an API key. Then
pip install mediacloud
See CHANGELOG.md for a detailed history of changes.
Find out how many stories in the top US online news sites mentioned "Zimbabwe" and "president" in the last year:
    import mediacloud

    mc = mediacloud.api.MediaCloud('MY_API_KEY')
    res = mc.storyCount('zimbabwe AND president AND tags_id_media:58722749',
                        'publish_date:[NOW-1YEAR TO NOW]')
    print(res['count'])  # prints the number of stories found
Get 2000 stories from the NYT about a topic in 2018 and dump the output as JSON:

    import datetime
    import json
    import mediacloud

    mc = mediacloud.api.MediaCloud('MY_API_KEY')
    fetch_size = 500
    stories = []
    last_processed_stories_id = 0
    # page through the results until we have 2000 stories or run out
    while len(stories) < 2000:
        fetched_stories = mc.storyList(
            'trump AND "north korea" AND media_id:1',
            solr_filter=mc.publish_date_query(datetime.date(2018, 1, 1), datetime.date(2019, 1, 1)),
            last_processed_stories_id=last_processed_stories_id,
            rows=fetch_size)
        stories.extend(fetched_stories)
        if len(fetched_stories) < fetch_size:
            break  # a short page means there are no more stories
        last_processed_stories_id = stories[-1]['processed_stories_id']
    print(json.dumps(stories))
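The paging loop above can be pulled out into a small reusable helper. This is only a sketch: `fetch_stories` and the `fetch_page(last_id, rows)` callable are hypothetical names, not part of the mediacloud package. The only assumption taken from the example is that each story dict carries a `processed_stories_id` key.

```python
def fetch_stories(fetch_page, limit, page_size=500):
    """Collect up to `limit` stories by paging on processed_stories_id.

    fetch_page(last_id, rows) is a caller-supplied function (an assumption
    of this sketch) that returns a list of story dicts.
    """
    stories = []
    last_id = 0
    while len(stories) < limit:
        page = fetch_page(last_id, page_size)
        stories.extend(page)
        if len(page) < page_size:
            break  # a short page means there is nothing left to fetch
        last_id = page[-1]["processed_stories_id"]
    return stories[:limit]

# Hypothetical wiring to the client from the example above:
# stories = fetch_stories(
#     lambda last_id, rows: mc.storyList('trump AND "north korea" AND media_id:1',
#                                        last_processed_stories_id=last_id,
#                                        rows=rows),
#     limit=2000)
```

Passing the fetch function in as an argument keeps the loop testable without an API key or network access.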
Find the most commonly used words in stories from the US top online news sites that mentioned "Zimbabwe" and "president" in 2013:
    import datetime
    import mediacloud

    mc = mediacloud.api.MediaCloud('MY_API_KEY')
    words = mc.wordCount('zimbabwe AND president AND tags_id_media:58722749',
                         mc.publish_date_query(datetime.date(2013, 1, 1), datetime.date(2014, 1, 1)))
    print(words)  # prints the most commonly used words
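The date filters in these examples are Solr range queries on `publish_date`, like the `publish_date:[NOW-1YEAR TO NOW]` string used earlier. As a rough illustration of what such a filter looks like for fixed dates, the sketch below builds one by hand. `solr_date_range` is a hypothetical helper, not part of the mediacloud package, and the exact string returned by `mc.publish_date_query` may differ.

```python
import datetime

def solr_date_range(start, end, field="publish_date"):
    """Build a Solr range filter: start inclusive ('['), end exclusive ('}')."""
    fmt = "%Y-%m-%dT00:00:00Z"  # midnight UTC on the given date
    return "{}:[{} TO {}}}".format(field, start.strftime(fmt), end.strftime(fmt))

print(solr_date_range(datetime.date(2013, 1, 1), datetime.date(2014, 1, 1)))
```

In Solr range syntax a square bracket includes the endpoint and a curly brace excludes it, so this range covers all of 2013 without including midnight on 2014-01-01.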
To find out all the details about one particular story by id:
    import mediacloud

    mc = mediacloud.api.MediaCloud('MY_API_KEY')
    story = mc.story(169440976)
    print(story['url'])  # prints the url the story came from
To save the first 100 stories from one day to a database:
    import datetime
    import mediacloud

    mc = mediacloud.api.MediaCloud('MY_API_KEY')
    db = mediacloud.storage.MongoStoryDatabase('one_day')
    stories = mc.storyList('*',
                           mc.publish_date_query(datetime.date(2014, 1, 1), datetime.date(2014, 1, 2)),
                           last_processed_stories_id=0, rows=100)
    # save each story to the database
    for s in stories:
        db.addStory(s)
    print(db.storyCount())
Take a look at the tests in the mediacloud/test/ module for more detailed examples.
If you are interested in adding code to this module, first clone the GitHub repository.
You need to create an MC_API_KEY envvar and set it to your API key (we use …). Then run make test to run the tests. We run continuous integration (via Travis), so every push runs the whole test suite (we also do this nightly and on PRs).
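For scripts of your own, the same MC_API_KEY environment variable can be read with the standard library instead of hard-coding the key. This is a minimal sketch: `load_api_key` is a hypothetical helper, not something the mediacloud package provides, and it does not reflect how the test suite itself loads the key.

```python
import os

def load_api_key(env=None, name="MC_API_KEY"):
    """Return the API key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            "Set the {} environment variable to your API key".format(name))
    return key

# Hypothetical usage with the client from the examples above:
# mc = mediacloud.api.MediaCloud(load_api_key())
```

Failing early with a clear message avoids confusing authentication errors later in a run.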
Distributing a New Version
If you want to, set up twine's keyring integration to avoid typing your PyPI password over and over.
- Run make test to make sure all the tests pass
- Update the version number in
- Make a brief note in the version history section in the README file about the changes
- Run make build-release to create an install package
- Run make release-test to upload it to PyPI's test platform
- Run make release to upload it to PyPI