Media Cloud: Setting up Your Environment
========================================

[Media Cloud](https://mediacloud.org) is an open-source platform for media analysis. It is a collaborative academic project supported by various non-profit foundations since 2011. You can use our various online tools to investigate news coverage about your topic of interest, and all the same information is available via a rich API.

This set of notebooks is a brief introduction to the API. It covers many of the most common operations we see researchers performing. The API is fully featured, so much so that [all our web-based tools](https://tools.mediacloud.org/#/home) are built on top of it.

Relevant references:
* Our [Python client for the API on PyPI](https://pypi.org/project/mediacloud/)
* The [general Media Cloud API specification](https://github.com/berkmancenter/mediacloud/blob/master/doc/api_2_0_spec/api_2_0_spec.md)
* The [topic-mapper specific Media Cloud API Specification](https://github.com/berkmancenter/mediacloud/blob/master/doc/api_2_0_spec/topics_api_2_0_spec.md)

## Set Up Your API Key for this Tutorial

You need to instantiate a client with your **private** API key. This key is linked to your account and has a quota attached to it, which protects our servers from overload. If you exceed the quota, your API calls will return errors. Email us if you need a higher quota.

To obtain your API key:
1. [Log in to any of our tools](https://tools.mediacloud.org/)
2. Click the little person icon in the top right, then select "profile"
3. Copy your API key from the list of information about your account

This tutorial uses the popular [`python-dotenv`](https://pypi.org/project/python-dotenv/) library to load the key in each notebook file without exposing it in the notebook itself. To store your key where the library can find it:

1. In this Jupyter Lab hosted on Binder, select File -> Open from Path from the menu bar
2. Type in ".env" and click Open
3. Replace "MY_MC_API_KEY" with your API key (from your profile page)
4. Select File -> Save
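
Before building a client, it can be worth confirming that the key was actually saved. The sketch below is one way to do that using only the standard library. The variable name `MC_API_KEY` and the helper `read_api_key` are assumptions made for illustration, so use whatever name your `.env` file really defines. It is meant to run after `load_dotenv()` has populated the environment.

```python
import os

PLACEHOLDER = "MY_MC_API_KEY"  # the placeholder text the steps above ask you to replace


def read_api_key(environ=None, name="MC_API_KEY"):
    """Return the API key stored under `name`, or raise a descriptive error.

    `name` is an assumed variable name; change it to match your .env file.
    """
    environ = os.environ if environ is None else environ
    key = (environ.get(name) or "").strip()
    if not key:
        raise ValueError(f"{name} is not set; did you save the .env file?")
    if key == PLACEHOLDER:
        raise ValueError(f"{name} still holds the placeholder; paste in your real key")
    return key
```

Failing early with a readable message tends to be easier to debug than an authentication error returned by the first API call.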

## Installing and Instantiating a Client

All our web tools are built on top of our API. Most endpoints are publicly available, while others require administrative access. You can read a summary and see the low-level API documentation in our [back-end GitHub repository](https://github.com/berkmancenter/mediacloud/blob/master/doc/api_2_0_spec/api_2_0_spec.md).

In [3]:
# If you are running this locally (not on Binder), then you should install the requirements. If you are using this on
# Binder then all of these will be installed for you automatically.
#import sys
#!{sys.executable} -m pip install -r requirements.txt
!pip install requests==2.20.0
!pip install pymongo
!pip install pylint
!pip install python-dotenv
!pip install twine
!pip install wheel
!pip install keyring
!pip install mediacloud

Collecting requests==2.20.0
  Downloading requests-2.20.0-py2.py3-none-any.whl (60 kB)
Collecting idna<2.8,>=2.5
  Downloading idna-2.7-py2.py3-none-any.whl (58 kB)
Collecting urllib3<1.25,>=1.21.1
  Downloading urllib3-1.24.3-py2.py3-none-any.whl (118 kB)
Installing collected packages: idna, urllib3, requests
  Attempting uninstall: idna
    Found existing installation: idna 3.2
    Uninstalling idna-3.2:


ERROR: Could not install packages due to an EnvironmentError: [WinError 5] Access is denied: 'c:\\programdata\\anaconda3\\lib\\site-packages\\idna-3.2.dist-info\\INSTALLER'
Consider using the `--user` option or check the permissions.



Collecting pymongo
  Downloading pymongo-4.0-cp38-cp38-win_amd64.whl (354 kB)
Installing collected packages: pymongo
Successfully installed pymongo-4.0
Collecting python-dotenv
  Downloading python_dotenv-0.19.2-py2.py3-none-any.whl (17 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-0.19.2
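
In the output above, one install stops with a permissions error while others succeed, so it can help to check what is actually installed before moving on. The following is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8 or later). The helper name `installed_versions` is hypothetical, and the package list is just a subset of the names from the install cell.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names taken from the install cell above.
for name, ver in installed_versions(["requests", "python-dotenv", "mediacloud"]).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

If a package is reported as missing, the hint in the pip error message (the `--user` option, or fixing permissions) is the first thing to try.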


In [4]:
from dotenv import load_dotenv
load_dotenv()  # load config from .env file

True

In [8]:
import os, mediacloud.api
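# Read your personal API key from that .env file
# (assumes the variable in .env is named MC_API_KEY; adjust to match your file)
my_mc_api_key = os.getenv('MC_API_KEY')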
# A convention we use is to name your api client `mc`
mc = mediacloud.api.MediaCloud(my_mc_api_key)
mediacloud.__version__

'3.12.5'

In [9]:
# make sure your connection and API key work by asking for the high-level system statistics
mc.stats()

{'active_crawled_feeds': 163577,
 'active_crawled_media': 58445,
 'daily_downloads': 1079743,
 'daily_stories': 680552,
 'mediacloud_stats_id': 731,
 'stats_date': '2021-11-28',
 'total_downloads': 0,
 'total_sentences': 0,
 'total_stories': 1980295680}

In [10]:
# or print it out as a nice json tree - we'll use this later (only works in Jupyter Lab)
from IPython.display import JSON
JSON(mc.stats())

<IPython.core.display.JSON object>
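
The dictionary returned by `mc.stats()` is plain Python data, so it can be used directly in calculations. As a rough illustration, the sketch below derives two ratios from a copy of the sample values shown earlier. The helper `summarize_stats` is hypothetical, and reading `active_crawled_media` as "number of media sources" is an assumption based on the field name rather than on the API specification.

```python
# Sample values copied from the `mc.stats()` output shown earlier.
sample_stats = {
    "active_crawled_feeds": 163577,
    "active_crawled_media": 58445,
    "daily_stories": 680552,
    "stats_date": "2021-11-28",
    "total_stories": 1980295680,
}


def summarize_stats(stats):
    """Derive a few rough ratios from a stats dictionary (illustrative only)."""
    media = stats["active_crawled_media"]
    return {
        "feeds_per_source": stats["active_crawled_feeds"] / media,
        "daily_stories_per_source": stats["daily_stories"] / media,
    }


summary = summarize_stats(sample_stats)
for label, value in summary.items():
    print(f"{label}: {value:.2f}")
```

With a live client, passing `mc.stats()` in place of `sample_stats` should work the same way, provided the response contains the same keys.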