# Guidelines for Project 1

This document contains guidelines, requirements, and suggestions for Project 1.

## Team Effort

Before anything, remember that Projects are a **group effort**: Working closely with your teammates is a requirement. This both helps teach real-world collaborative workflows, and enables you to tackle more difficult problems than you'd be able to working alone. 

In other words, working in groups allows you to **work smart** and **dream big**. Take advantage of it!

## Project Proposal

Before you start writing any code, your group should outline the scope and purpose of your project. This helps provide direction and prevent [scope creep](https://en.wikipedia.org/wiki/Scope_creep).

Write this as a brief summary of your interests and intent, including:

* The kind of data you'd like to work with/field you're interested in (e.g., geodata, weather data, etc.)

* The kinds of questions you'll be asking of that data

* Possible source for such data

In other words, write down what kind of data you plan to work with, and what kinds of questions you'd like to ask of it. This constitutes your Project Proposal/Outline, and should look something like this:

> Our project is to uncover patterns in criminal activity around Los Angeles. We'll examine relationships between types of crime and location; crime rates and times of day; trends in crime rates over the course of the year; and related questions, as the data admits.

## Finding Data

Once your group has written an outline, it's time to start hunting for data. You are free to use data from any source, but we recommend the following curated sources of high-quality data:

* [data.world](https://data.world/)

* [Kaggle](https://www.kaggle.com/)

* [Data.gov](https://www.data.gov)

Chances are you'll have to update your Project Outline as you explore the available data. **This is fine**&mdash;adjustments like this are part of the process! Just make sure everyone in the group is up-to-speed on the goals of the project as you make changes.

## Data Cleanup &amp; Analysis

With data in hand, it's time to tackle development and analysis. This is where the fun starts!

Inevitably, the analysis process can be broken into two broad phases: **Exploration &amp; Cleanup** and **Analysis** proper.

As you've learned, you'll need to explore, clean, and reformat your data before you can begin to answer your research questions. We recommend keeping track of these exploration and cleanup steps in a dedicated Jupyter Notebook, both for organization's sake and to make it easier to  present your work later.

Similarly, after you've massaged your data and are ready to start crunching numbers, you should keep track of your work in a Jupyter Notebook dedicated specifically to analysis.

During both phases, **don't forget to include plots**! Don't make the mistake of waiting to build figures until you're preparing your presentation. Creating them along the way can reveal insights and interesting trends in the data that you might not notice otherwise.

Finally, be sure that your projects meet the [technical requirements](TechnicalRequirements.md).

## Presentation

After you've analyzed your data to your satisfaction, you'll put together a presentation to show of your work, explain your process, and discuss your conclusions.

This presentation will be delivered as a slideshow, and should give your classmates and instructional staff an overview of your work. PowerPoint, Keynote, and Google Slides are all acceptable for building slides. 

As long as your slides meet the [presentation requirements](PresentationRequirements.md), you are free to structure the presentation however you wish, but students are often successful with the format laid out in the [presentation guidelines](PresentationGuidelines.md).

## Sample Ideas &amp; Inspiration



- - - 

### Copyright

Coding Boot Camp &copy; 2017. All Rights Reserved.


In [2]:

import json
import requests
from time import sleep
import pandas as pd

api_key = '1bb6cfd646261b1acda61748ca2bb5a7'



In [3]:

url = 'http://api.themoviedb.org/3'
search_type = '/search/tv'
query = "twilight%20zone"
dat = requests.get(url + search_type + "?page=1&query=" + query + "&api_key=" + api_key).json()

dat


{'page': 1,
 'results': [{'backdrop_path': '/lKdKgLoLPmnHoUXunrcSAuBhAJx.jpg',
   'first_air_date': '1959-10-02',
   'genre_ids': [35, 18, 9648, 10765],
   'id': 6357,
   'name': 'The Twilight Zone',
   'origin_country': ['US'],
   'original_language': 'en',
   'original_name': 'The Twilight Zone',
   'overview': 'A series of unrelated stories containing drama, psychological thriller, fantasy, science fiction, suspense, and/or horror, often concluding with a macabre or unexpected twist.',
   'popularity': 10.39861,
   'poster_path': '/ehx3JR5kJS9JUs4wEc2zcRGbHFR.jpg',
   'vote_average': 8.19,
   'vote_count': 168},
  {'backdrop_path': '/1L8JCGfpIp5BK5yiwrddW7ztLAL.jpg',
   'first_air_date': '1985-09-27',
   'genre_ids': [10765, 18],
   'id': 1918,
   'name': 'The Twilight Zone',
   'origin_country': ['US'],
   'original_language': 'en',
   'original_name': 'The Twilight Zone',
   'overview': "The Twilight Zone is the first of two revivals of Rod Serling's acclaimed 1950/60s television 

In [4]:
show_id = dat['results'][0]['id']
show_id

6357

In [5]:
search_type = f'/tv/{show_id}'

dat = requests.get(url + search_type + "?page=1&query=" + query + "&api_key=" + api_key).json()

In [6]:
seasons = []
for season in dat['seasons']:
    seasons.append({'season_num': season['season_number'], 'episode_count': season['episode_count']})
print(seasons[1])

{'season_num': 1, 'episode_count': 37}


In [7]:
search_type = f'/tv/{show_id}/credits'

dat = requests.get(url + search_type + "?page=1&query=" + query + "&api_key=" + api_key).json()

main_cast = []
for character in dat['cast']:
    main_cast.append(character['name'])
# 'Majel Barrett', 'Bill Blackburn','Frank da Vinci' not really main cast?
main_cast


['Rod Serling']

In [10]:
guest_list = []
guest_ids = []
id_to_name = {}
#season_num = 1
#episode_num = 1
for season in seasons:
    season_num = season['season_num']
    episode_count = season['episode_count']
    for episode_num in range(1,episode_count):
        print(season_num, ':' , episode_num, end = '\t')
        sleep(.3)
        dat = requests.get(f'https://api.themoviedb.org/3/tv/{show_id}/season/{season_num}/episode/{episode_num}?api_key=1bb6cfd646261b1acda61748ca2bb5a7&language=en-US').json()
        try:
            for star in dat['guest_stars']:
                if star['name'] not in guest_list and star['name'] not in main_cast:
                    guest_list.append(star['name'])
                    guest_ids.append(star['id'])
                    id_to_name[star['id']] = star['name']
                    
        except KeyError:
            print('\n uh-oh', season_num, ':' , episode_num)
    print()
print(guest_ids)

0 : 1	0 : 2	0 : 3	0 : 4	0 : 5	0 : 6	0 : 7	0 : 8	0 : 9	0 : 10	0 : 11	0 : 12	0 : 13	0 : 14	0 : 15	0 : 16	0 : 17	0 : 18	0 : 19	0 : 20	
1 : 1	1 : 2	1 : 3	1 : 4	1 : 5	1 : 6	1 : 7	1 : 8	1 : 9	1 : 10	1 : 11	1 : 12	1 : 13	1 : 14	1 : 15	1 : 16	1 : 17	1 : 18	1 : 19	1 : 20	1 : 21	1 : 22	1 : 23	1 : 24	1 : 25	1 : 26	1 : 27	1 : 28	1 : 29	1 : 30	1 : 31	1 : 32	1 : 33	1 : 34	1 : 35	1 : 36	
2 : 1	2 : 2	2 : 3	2 : 4	2 : 5	2 : 6	2 : 7	2 : 8	2 : 9	2 : 10	2 : 11	2 : 12	2 : 13	2 : 14	2 : 15	2 : 16	2 : 17	2 : 18	2 : 19	2 : 20	2 : 21	2 : 22	2 : 23	2 : 24	2 : 25	2 : 26	2 : 27	2 : 28	
3 : 1	3 : 2	3 : 3	3 : 4	3 : 5	3 : 6	3 : 7	3 : 8	3 : 9	3 : 10	3 : 11	3 : 12	3 : 13	3 : 14	3 : 15	3 : 16	3 : 17	3 : 18	3 : 19	3 : 20	3 : 21	3 : 22	3 : 23	3 : 24	3 : 25	3 : 26	3 : 27	3 : 28	3 : 29	3 : 30	3 : 31	3 : 32	3 : 33	3 : 34	3 : 35	3 : 36	
4 : 1	4 : 2	4 : 3	4 : 4	4 : 5	4 : 6	4 : 7	4 : 8	4 : 9	4 : 10	4 : 11	4 : 12	4 : 13	4 : 14	4 : 15	4 : 16	4 : 17	
5 : 1	5 : 2	5 : 3	5 : 4	5 : 5	5 : 6	5 : 7	5 : 8	5 : 9	5 : 10	5 : 11	5 : 12	5 : 13

In [11]:
guest_list
id_to_name

{726: 'Jack Weston',
 863: 'Orson Bean',
 923: 'Dean Stockwell',
 1107: 'R.G. Armstrong',
 1153: 'John A. Alonzo',
 1748: 'William Shatner',
 1749: 'Leonard Nimoy',
 1751: 'James Doohan',
 1752: 'George Takei',
 1935: 'Buddy Ebsen',
 1936: 'Martin Balsam',
 1937: 'Mickey Rooney',
 1943: 'John McGiver',
 1947: 'Stanley Adams',
 2081: 'George Murdock',
 2097: 'Ted de Corsia',
 2101: 'Percy Helton',
 2226: 'Sydney Pollack',
 2314: 'Peter Falk',
 2641: 'Martin Landau',
 2643: 'Josephine Hutchinson',
 2644: 'Philip Ober',
 2645: 'Adam Williams',
 2646: 'Edward Platt',
 2651: 'Edward Binns',
 2652: 'Ken Lynch',
 2672: 'Jack Carson',
 2778: 'Dennis Hopper',
 2782: 'Ian Wolfe',
 3014: 'Val Avery',
 3087: 'Robert Duvall',
 3090: 'Richard Conte',
 3142: 'John Marley',
 3160: 'Nehemiah Persoff',
 3163: 'George E. Stone',
 3262: 'James Flavin',
 3343: 'Jay Adler',
 3346: 'Dorothy Adams',
 3366: 'Gladys Cooper',
 3461: 'Jack Albertson',
 3640: 'Larry Gates',
 3641: 'Vaughn Taylor',
 3798: 'Pat Hing

In [12]:
movies = {}
count = 0
for id_num in guest_ids:
    #id_num = 83913
    #index = 0
    url = f'https://api.themoviedb.org/3/person/{id_num}/movie_credits?api_key=1bb6cfd646261b1acda61748ca2bb5a7&language=en-US'
    dat = requests.get(url).json()
    sleep(.3)
    try:
        for movie in dat['cast']:
            #print(json.dumps(dat, indent = 2, sort_keys= True))
            #print(dat['cast'])
            if movie['id'] not in movies:
                movies[movie['id']] = {'Movie': movie['original_title'], 'movie_id': movie['id'], 'guest_names': [] , 'guest_ids': [], 'count': 0}
                
            movies[movie['id']]['guest_names'].append(id_to_name[id_num])
            movies[movie['id']]['guest_ids'].append(id_num)
            movies[movie['id']]['count'] += 1
    except KeyError:
        print('ERROR', url)
        #so far, produces one error for an actor who appears in no movies
    count += 1
    print(count, '/', len(guest_ids))
print(movies)

1 / 623
2 / 623
3 / 623
4 / 623
5 / 623
6 / 623
7 / 623
8 / 623
9 / 623
10 / 623
11 / 623
12 / 623
13 / 623
14 / 623
15 / 623
16 / 623
17 / 623
18 / 623
19 / 623
20 / 623
21 / 623
22 / 623
23 / 623
24 / 623
25 / 623
26 / 623
27 / 623
28 / 623
29 / 623
30 / 623
31 / 623
32 / 623
33 / 623
34 / 623
35 / 623
36 / 623
37 / 623
38 / 623
39 / 623
40 / 623
41 / 623
42 / 623
43 / 623
44 / 623
45 / 623
46 / 623
47 / 623
48 / 623
49 / 623
50 / 623
51 / 623
52 / 623
53 / 623
54 / 623
55 / 623
56 / 623
57 / 623
58 / 623
59 / 623
60 / 623
61 / 623
62 / 623
63 / 623
64 / 623
65 / 623
66 / 623
67 / 623
68 / 623
69 / 623
70 / 623
71 / 623
72 / 623
73 / 623
ERROR https://api.themoviedb.org/3/person/1196701/movie_credits?api_key=1bb6cfd646261b1acda61748ca2bb5a7&language=en-US
74 / 623
75 / 623
76 / 623
77 / 623
78 / 623
79 / 623
80 / 623
81 / 623
82 / 623
83 / 623
84 / 623
85 / 623
86 / 623
87 / 623
88 / 623
89 / 623
90 / 623
91 / 623
92 / 623
93 / 623
94 / 623
95 / 623
96 / 623
97 / 623
98 / 623
99 / 62

In [13]:
df = pd.DataFrame(movies).T

df = df.set_index('Movie')
df = df.sort_values('count',ascending = False)
df.to_csv('first_returns.csv')
df

Unnamed: 0_level_0,count,guest_ids,guest_names,movie_id
Movie,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
The Greatest Story Ever Told,15,"[5833, 2641, 3090, 3160, 7505, 12355, 135066, ...","[Ed Wynn, Martin Landau, Richard Conte, Nehemi...",2428
North by Northwest,13,"[115460, 2652, 2641, 2651, 2645, 18870, 114961...","[Malcolm Atterbury, Ken Lynch, Martin Landau, ...",213
Nevada Smith,11,"[179314, 2641, 2097, 31353, 3014, 8260, 1947, ...","[Merritt Bohn, Martin Landau, Ted de Corsia, S...",5921
The Man Who Shot Liberty Valance,11,"[7303, 8516, 8260, 4078, 18391, 7520, 89938, 1...","[Vera Miles, John Carradine, Strother Martin, ...",11697
Pocketful of Miracles,10,"[14452, 96722, 13874, 40623, 4965, 2314, 3163,...","[Jerome Cowan, Byron Foulger, Marc Cavell, Hay...",248
Psycho,10,"[1936, 71146, 3641, 7303, 53010, 19111, 14063,...","[Martin Balsam, Ted Knight, Vaughn Taylor, Ver...",539
Spartacus,10,"[2097, 151528, 7074, 1217498, 15949, 14507, 14...","[Ted de Corsia, Logan Field, Peter Brocco, Jac...",967
The Big Heat,10,"[2645, 96284, 1358454, 97835, 15693, 18391, 75...","[Adam Williams, Alexander Scourby, Ezelle Poul...",14580
"It's a Mad, Mad, Mad, Mad World",10,"[29579, 13593, 2314, 93664, 8635, 14966, 19400...","[Charles Lane, Jonathan Winters, Peter Falk, J...",11576
The Satan Bug,10,"[77136, 98458, 19111, 12309, 33479, 14063, 987...","[John Clarke, Russ Bender, John Anderson, Anne...",27709


In [None]:
#which two star trek actors worked with each other the most

In [None]:
#compare which three star trek actors worked with each other the most

In [None]:
#which star trek actor worked with the most other star trek actors