Make a New Media Collection
===========================

## Setup

The first time you run this notebook, run the cell below to install the libraries it relies on.

In [None]:
# Install a pip package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install mediacloud ipywidgets

## Configuration

Paste in your Media Cloud API key, which you can get from the profile page in any of our web apps.

In [81]:
my_api_key = "YOUR_API_KEY"

# set up the library to make calls to our API with your key
import mediacloud.api
mc = mediacloud.api.AdminMediaCloud(my_api_key)
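
If you would rather not paste the key into the notebook itself, one option is to read it from an environment variable. This is only a minimal sketch and not part of the original notebook: the variable name `MC_API_KEY` and the helper `load_api_key` are hypothetical names chosen for illustration.

```python
import os


def load_api_key(env_var="MC_API_KEY"):
    """Return the API key stored in an environment variable.

    MC_API_KEY is a hypothetical variable name used only in this sketch.
    """
    key = os.environ.get(env_var)
    if not key:
        raise ValueError(
            "Set the {} environment variable to your Media Cloud API key".format(env_var)
        )
    return key
```

With the variable set in your shell before starting Jupyter, `my_api_key = load_api_key()` could replace the hard-coded string above.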

Now set up the details about your collection. Edit the values in quotes below.

In [82]:
collection_name = "My Awesome Collection3"  # what do you want users to see as the name for this collection?
collection_computer_name = "my_awesome_collection3"  # give us a "computery" version of the collection name too
collection_description = "Explain the purpose, and credit people who helped"
is_it_static = False  # set this to True (no quotes) if this collection is a one-time snapshot that won't change, False otherwise
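
If you prefer to derive the "computery" name from the display name instead of typing both, a sketch like the following may help. The notebook does not state what characters Media Cloud accepts in this field, so the rule used here (lowercase letters, digits, and underscores) is an assumption, and `make_computer_name` is a hypothetical helper.

```python
import re


def make_computer_name(label):
    """Lowercase the label and replace each run of other characters with one underscore."""
    slug = re.sub(r"[^a-z0-9]+", "_", label.lower())
    return slug.strip("_")


print(make_computer_name("My Awesome Collection3"))  # prints my_awesome_collection3
```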

Now we need to check whether this is a new collection, based on the `collection_name` you picked. If a collection with that name already exists, we will update it to match your spreadsheet.

In [88]:
COLLECTION_TAG_SET = 5  # the id of the collection of collections (yes, this is confusing)
similar_collections = mc.tagList(tag_sets_id=COLLECTION_TAG_SET, name_like=collection_name)
existing_collection_id = None
for c in similar_collections:
    if c['label'].lower() == collection_name.lower():
        existing_collection_id = c['tags_id']
        print("This collection already exists, with id {}. We will update it, and NOT create a new one".format(existing_collection_id))
if existing_collection_id is None:
    print("We will make a new collection for you.")

This collection already exists, with id 257344083. We will update it, and NOT create a new one
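
The check above compares labels exactly, ignoring case, because a "name like" search can presumably return collections whose names only partly match. The same comparison can be sketched as a standalone function run against made-up data. The sample dictionaries below are invented for illustration and only mimic the `label` and `tags_id` keys the notebook reads.

```python
def find_exact_match(candidates, wanted_name):
    """Return the tags_id of the first candidate whose label equals wanted_name, ignoring case."""
    for c in candidates:
        if c['label'].lower() == wanted_name.lower():
            return c['tags_id']
    return None


# made-up search results, for illustration only
sample = [
    {'tags_id': 1, 'label': 'My Awesome Collection 2'},
    {'tags_id': 2, 'label': 'my awesome collection'},
]
print(find_exact_match(sample, "My Awesome Collection"))  # prints 2
```

Unlike the loop in the notebook, this helper stops at the first match instead of keeping the last one.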


## Make the Thing!

In [89]:
# is it public?
show_on_stories = True
show_on_media = True

### Make/Find Media Sources

First we iterate over all the media sources to find them or create them as needed.

In [90]:
# read in the list of media
import csv
INPUT_FILE = 'collection.csv'
media_list = []
with open(INPUT_FILE) as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        media_list.append(row)
print("Loaded {} media from {}:".format(len(media_list), INPUT_FILE))
# This step checks each source. It might take a while, but one row is printed below for each item,
# so you can track the progress.
import json
media_ids = []
for idx, m in enumerate(media_list):
    if len(m['media_id'])==0:
        params = m.copy()
        valid_params = ['url', 'name', 'foreign_rss_links', 'content_delay', 'feeds', 'tags_ids', 'editor_notes',
                        'public_notes', 'is_monitored']
        invalid_params = [k for k in params if k not in valid_params]
        for p in invalid_params:
            del params[p]
        results = mc.mediaCreate([params])[0]
        if results['status'] == 'existing':
            print("  Row {}: {} - found existing source media_id {}".format(idx, m['url'], results['media_id']))
        else:
            print("  Row {}: {} - created new source media_id {}".format(idx, m['url'], results['media_id']))
        media_ids.append(results['media_id'])
    else:
        print("  Row {}: {} - using media_id {}".format(idx, m['url'], m['media_id']))
        media_ids.append(int(m['media_id']))  # CSV values are strings; convert so every id is an int

Loaded 3 media from collection.csv:
  Row 0: http://nytimes.com - using media_id 1
  Row 1: http://www.cbsnews.com/ - found existing source media_id 1752
  Row 2: http://www.cnn.com/ - found existing source media_id 1095
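
To make the expected input concrete, here is a sketch of how rows are split into "already has a `media_id`" and "needs to be found or created". The CSV contents are invented: the `media_id`, `url`, and `name` columns follow the names used in the cell above, while `country` is a made-up extra column that gets filtered out. `split_rows` is a hypothetical helper, and no API calls are made.

```python
import csv
import io

# Invented sample data; a blank media_id means "find or create this source"
SAMPLE_CSV = """media_id,url,name,country
1,http://nytimes.com,New York Times,US
,http://www.cbsnews.com/,CBS News,US
"""

# Same field names as the valid_params list in the cell above
VALID_PARAMS = {'url', 'name', 'foreign_rss_links', 'content_delay', 'feeds',
                'tags_ids', 'editor_notes', 'public_notes', 'is_monitored'}


def split_rows(rows):
    """Separate rows with a known media_id from rows that still need a lookup."""
    known_ids, to_create = [], []
    for row in rows:
        if row.get('media_id'):
            known_ids.append(int(row['media_id']))
        else:
            # keep only the columns listed in VALID_PARAMS
            to_create.append({k: v for k, v in row.items() if k in VALID_PARAMS})
    return known_ids, to_create


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
known_ids, to_create = split_rows(rows)
print(known_ids)
print(to_create)
```

Using `row.get('media_id')` also tolerates a spreadsheet that has no `media_id` column at all, which the `len(m['media_id'])` check above would not.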


In [91]:
# now make the new collection, if needed
collection_id = None
if existing_collection_id is None:
    new_collection = mc.createTag(COLLECTION_TAG_SET, collection_computer_name, collection_name, collection_description,
                                  is_static=is_it_static, show_on_stories=show_on_stories, show_on_media=show_on_media)
    collection_id = new_collection['tag']['tags_id']
else:
    collection_id = existing_collection_id
print("We'll have a collection id - {}".format(collection_id))

We'll have a collection id - 257344083


In [92]:
import mediacloud.tags
# Now we tag all the media so they show up as part of the new collection
tags_to_create = [mediacloud.tags.MediaTag(mid, tags_id=collection_id, action=mediacloud.tags.TAG_ACTION_ADD) for mid in media_ids]
if len(tags_to_create) > 0:
    mc.tagMedia(tags_to_create)

Can't remove private info from something that isn't a dict
Can't remove private info from something that isn't a dict


In [93]:
print("Check out the collection now, at https://sources.mediacloud.org/#collections/{}".format(collection_id))

Check out the collection now, at https://sources.mediacloud.org/#collections/257344083
