Media Cloud: Setting up Your Environment
========================================

[Media Cloud](https://mediacloud.org) is an open-source platform for media analysis. It is a collaborative academic project that has been supported by various non-profit foundations since 2011. It is now led by a consortium of the [University of Massachusetts Amherst](https://publicinfrastructure.org), [Northeastern University](http://dataculture.northeastern.edu), and the [Media Ecosystems Analysis Group](https://www.mediaecosystems.org). You can use our online tools to investigate news coverage of your topic of interest, and all the same information is available via a rich API.

This set of notebooks is a brief introduction to the API. It covers many of the most common operations we see researchers performing. The API is fully featured, so much so that [all our web-based tools](https://search.mediacloud.org/) are built on top of it.

Relevant references:
* Our [Python client for the API on PyPI](https://pypi.org/project/mediacloud/)

## Set Up Your API Key for this Tutorial

You need to instantiate a client with your **private** API key. This key is linked to your account and has a quota attached to it so you don't blow up our servers. If you use up your quota, your API calls will return errors. The quota resets automatically every week. Email us if you need to increase it.

To obtain your API key:
1. [log in to any of our tools](https://search.mediacloud.org/sign-in)
2. click the little person icon in the top right, then select "profile"
3. copy your API key from the list of information about your account

For this tutorial, we include an `MC_API_KEY` Python constant at the top of each notebook. Replace the placeholder value with your API key and you're off.
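
If you would rather not paste your private key into a notebook you might share, one alternative is to read it from an environment variable. This is only a minimal sketch: the `load_api_key` helper is hypothetical (it is not part of the `mediacloud` package), and using `MC_API_KEY` as the environment variable name is an assumption that simply mirrors the constant used in these notebooks.

```python
import os


def load_api_key(env_var: str = "MC_API_KEY") -> str:
    """Hypothetical helper: return the API key stored in an environment variable."""
    # Strip stray whitespace that often sneaks in when copying a key
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Media Cloud API key"
        )
    return key
```

With this in place, the notebook cell would read `MC_API_KEY = load_api_key()` instead of holding the key as a literal string.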

## Installing and Instantiating a Client

All our web tools are built on top of our API. Most endpoints are publicly available, while others require administrative access. You can read a summary and see the low-level API documentation in our [back-end GitHub repository](https://github.com/berkmancenter/mediacloud/blob/master/doc/api_2_0_spec/api_2_0_spec.md).

In [2]:
# If you are running this locally (not on Binder), then you should install the requirements. If you are using this on
# Binder then all of these will be installed for you automatically.
import sys
!{sys.executable} -m pip install -r requirements.txt

Collecting mediacloud>=4.1.0 (from -r requirements.txt (line 1))
  Downloading mediacloud-4.1.3-py3-none-any.whl.metadata (4.3 kB)
Downloading mediacloud-4.1.3-py3-none-any.whl (23 kB)
Installing collected packages: mediacloud
Successfully installed mediacloud-4.1.3


In [6]:
import os, mediacloud.api
from importlib.metadata import version
# Set your personal API key (replace the placeholder with your own private key)
MC_API_KEY = 'MY_API_KEY'
search_api = mediacloud.api.SearchApi(MC_API_KEY)
f'Using Media Cloud python client v{version("mediacloud")}'

'Using Media Cloud python client v4.1.3'

In [7]:
import datetime as dt
# make sure your connection and API key work by asking for the total count of stories in 2023
results = search_api.story_count('*', dt.date(2023,1,1), dt.date(2023,12,31))
results

{'relevant': 153764791, 'total': 153764791}
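
The result is a plain Python dictionary, so you can work with it directly. The sketch below assumes, based only on the key names, that `relevant` counts the stories matching the query and `total` counts all stories in the date range; the `relevant_share` helper is hypothetical and not part of the client. The values are copied from the output above so the snippet runs without an API key.

```python
def relevant_share(counts: dict) -> float:
    """Hypothetical helper: fraction of stories in the date range that matched."""
    total = counts["total"]
    # Guard against an empty date range to avoid dividing by zero
    return counts["relevant"] / total if total else 0.0


# Values copied from the story_count output shown above
results = {'relevant': 153764791, 'total': 153764791}
share = relevant_share(results)
print(f"{share:.1%} of stories in the date range matched the query")
```

Because the query here is the wildcard `'*'`, both numbers are equal and the share is 100.0%.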

In [5]:
# or print it out as a nice JSON tree - we'll use this later (only works in JupyterLab)
from IPython.display import JSON
JSON(results)

<IPython.core.display.JSON object>