# Working with data in Python

## Overview

* The purpose of this notebook is to examine some of the common formats (or data structures) you will come across when working with data in Python. It focuses on non-tabular data; tabular data is covered in another notebook using the `pandas` module.


* The two core structures built into Python that it is essential to be comfortable working with are:
    1. __LISTS__
    2. __DICTIONARIES__


* The two basic control-flow structures in Python code are:
    1. __LOOPS__
    2. __CONDITIONS__
    
    
    
* __JSON__ (JavaScript Object Notation) is a text serialization format for complex data structures, especially nested ones that cannot easily be represented as a table (for tabular data, CSV is more common).
    * In Python, working with JSON data is relatively simple because it is loaded as dictionaries and lists nested in various ways, often as a list of dictionaries.
    * Most APIs (e.g. Twitter, NYTimes, Genius, Qualtrics, Amazon MTurk, etc.) return their data as JSON.


* This notebook works up from simple lists and dictionaries, through lists of dictionaries, to a complex JSON example from the Twitter API.
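
As a minimal sketch of the JSON point above, the snippet below parses a small JSON string with the standard `json` module and shows that the result is ordinary Python dictionaries and lists. The field names (`user`, `tweets`, `likes`) are made up for illustration and are not the Twitter API's actual schema.

```python
import json

# A made-up JSON document (illustrative only, not a real API response)
raw_text = '''
{
  "user": "example_user",
  "tweets": [
    {"id": 1, "text": "hello", "likes": 3},
    {"id": 2, "text": "world", "likes": 0}
  ]
}
'''

# json.loads parses a string; json.load does the same for an open file handle
data = json.loads(raw_text)

print(type(data))            # the top-level JSON object becomes a dict
print(type(data["tweets"]))  # a JSON array becomes a list
print(data["tweets"][0]["text"])

# json.dumps goes the other way: Python structure back to JSON text
round_trip = json.loads(json.dumps(data))
```

Once the text is parsed, nothing JSON-specific remains: the usual list and dictionary operations apply.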


-----

#### HISTORY

* 2/27/19 mbod - initial version for Python workshop

In [6]:
import json
import csv

## Lists
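
A short sketch of the basic list operations, together with a loop and a condition. The values are made up for illustration.

```python
# A made-up list of ratings (illustrative values only)
ratings = [3, 2, 2, 1]

first = ratings[0]        # indexing starts at 0
last = ratings[-1]        # negative indexes count from the end
first_two = ratings[:2]   # slicing returns a new list

ratings.append(3)         # lists are mutable: add an item to the end

# Loop over the list and use a condition to keep only some items
high_ratings = []
for rating in ratings:
    if rating >= 2:
        high_ratings.append(rating)

print(len(ratings), high_ratings)
```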

## Dictionaries
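
A short sketch of the basic dictionary operations, ending with a list of dictionaries, which is the shape that record-style data usually takes. The keys and values here are hypothetical.

```python
# A made-up record for one trial (illustrative values only)
trial = {"trial_num": 1, "video": "clip_a.mp4", "share_rating": "3"}

video = trial["video"]                # look up a value by its key
missing = trial.get("reaction_time")  # .get returns None instead of raising KeyError

# Values can be replaced, here converting a string rating to an integer
trial["share_rating"] = int(trial["share_rating"])

# Loop over key/value pairs
for key, value in trial.items():
    print(key, "->", value)

# A list of dictionaries: one dictionary per record
trials = [trial, {"trial_num": 2, "video": "clip_b.mp4", "share_rating": 2}]
videos = [t["video"] for t in trials]
```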

In [7]:
csv.DictReader?
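
For context on why `csv.DictReader` appears under dictionaries: it reads each row of a CSV file as a dictionary keyed by the column names in the header row. The sketch below uses made-up CSV text wrapped in `io.StringIO` in place of a real file, so the column names and values are assumptions.

```python
import csv
import io

# Made-up CSV text standing in for a file on disk (illustrative only)
csv_text = "trial_num,video,share_rating\n1,clip_a.mp4,3\n2,clip_b.mp4,2\n"

# DictReader uses the first row as field names and yields one dict per row
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

print(rows[0]["video"])
print(rows[1]["share_rating"])  # note: every value is read as a string
```

The result is a list of dictionaries, so numeric columns need an explicit conversion such as `int(row["trial_num"])` before any arithmetic.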

In [2]:
with open('data/example_log_data.json') as fh:
    log_data = json.load(fh)

In [4]:
log_data.keys()

dict_keys(['log130.log', 'log131.log', 'log132.log', 'log133.log', 'log134.log', 'log135.log', 'log136.log', 'log137.log', 'log138.log', 'log139.log', 'log140.log', 'log141.log', 'log142.log', 'log143.log', 'log144.log', 'log145.log', 'log146.log', 'log147.log', 'log148.log', 'log149.log', 'log150.log', 'log151.log', 'log152.log', 'log153.log', 'log154.log', 'log155.log', 'log156.log', 'log157.log', 'log158.log', 'log159.log', 'log160.log', 'log161.log', 'log162.log', 'log163.log', 'log164.log', 'log165.log', 'log166.log', 'log167.log', 'log168.log', 'log169.log', 'log170.log', 'log171.log', 'log172.log'])
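
Assuming each value in a dictionary like `log_data` is a list of per-trial dictionaries with a string `share_rating` (which is how the entry for `log166.log` is displayed in this notebook), a loop and a condition are enough to summarize it. The sketch below uses small stand-in data with made-up file names and values rather than the real file.

```python
# Stand-in data with a similar shape (made-up values, not the real file)
sample_logs = {
    "log_a.log": [
        {"share_rating": "3", "trial_num": 1, "video": "clip_a.mp4"},
        {"share_rating": "1", "trial_num": 2, "video": "clip_b.mp4"},
    ],
    "log_b.log": [
        {"share_rating": "2", "trial_num": 1, "video": "clip_b.mp4"},
    ],
}

mean_ratings = {}
for log_name, trials in sample_logs.items():
    # share_rating is stored as a string, so convert before doing arithmetic
    ratings = [int(trial["share_rating"]) for trial in trials]
    if ratings:  # skip logs with no trials to avoid dividing by zero
        mean_ratings[log_name] = sum(ratings) / len(ratings)

print(mean_ratings)
```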

In [5]:
log_data['log166.log']

[{'share_rating': '3', 'trial_num': 1, 'video': 'a2r18h.mp4'},
 {'share_rating': '2', 'trial_num': 2, 'video': 'a1r11n.mp4'},
 {'share_rating': '2', 'trial_num': 3, 'video': 'a2d18h.mp4'},
 {'share_rating': '1', 'trial_num': 4, 'video': 'a2d13h.mp4'},
 {'share_rating': '1', 'trial_num': 5, 'video': 'a2r15h.mp4'},
 {'share_rating': '3', 'trial_num': 6, 'video': 'a1d10n.mp4'},
 {'share_rating': '1', 'trial_num': 7, 'video': 'a2d16h.mp4'},
 {'share_rating': '2', 'trial_num': 8, 'video': 'a1r12n.mp4'},
 {'share_rating': '1', 'trial_num': 9, 'video': 'a2d21h.mp4'},
 {'share_rating': '1', 'trial_num': 10, 'video': 'a2r23h.mp4'},
 {'share_rating': '2', 'trial_num': 11, 'video': 'a1d12n.mp4'},
 {'share_rating': '1', 'trial_num': 12, 'video': 'a1r09n.mp4'},
 {'share_rating': '1', 'trial_num': 13, 'video': 'a2r22h.mp4'},
 {'share_rating': '1', 'trial_num': 14, 'video': 'a1d11n.mp4'},
 {'share_rating': '3', 'trial_num': 15, 'video': 'a1d15n.mp4'},
 {'share_rating': '2', 'trial_num': 16, 'video': 