<a href="https://colab.research.google.com/github/lukaradonjic21/cap-comp215/blob/main/labs/lab3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

COMP 215 - LAB 3
----------------
#### Name: Luka Radonjic
#### Date: 1/25/2022
This lab exercise is mostly a review of strings, tuples, lists, dictionaries, and functions.
We will also see how "list comprehension" provides a compact form for "list accumulator" algorithms.
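
As a minimal illustration of that equivalence, the sketch below writes the same algorithm both ways. The `words` list is made-up sample data, not part of the lab.

```python
# Made-up sample data for illustration only.
words = ['lexx', 'kai', 'xev', 'stanley']

# "List accumulator" pattern: start with an empty list, append inside a loop.
lengths_loop = []
for word in words:
  lengths_loop.append(len(word))

# The same algorithm written as a list comprehension.
lengths_comp = [len(word) for word in words]

print(lengths_loop)
print(lengths_comp)
```

Both forms build the same list; the comprehension just folds the "create, loop, append" steps into one expression.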

As usual, the first code cell simply imports all the modules we'll be using...

In [1]:
import json, requests
import matplotlib.pyplot as plt
from pprint import pprint

We'll answer some questions about movies and TV shows with the IMDb database:  https://www.imdb.com/
> using the IMDb API:  https://imdb-api.com/api

You can register for your own API key, or simply use the one provided below.

Here's an example query:
 *   search for TV Series with title == "Lexx"

In [2]:
API_KEY = 'k_ynffhhna'

title = 'lexx'
url = "https://imdb-api.com/en/API/SearchTitle/{key}/{title}".format(key=API_KEY, title=title)

response = requests.request("GET", url, headers={}, data={})

data = json.loads(response.text)  # recall json.loads from lab 1

results = data['results']
pprint(results)

[{'description': '(1996) (TV Series)',
  'id': 'tt0115243',
  'image': 'https://imdb-api.com/images/original/MV5BOGFjMzQyMTYtMjQxNy00NjAyLWI2OWMtZGVhMjk4OGM3ZjE5XkEyXkFqcGdeQXVyNzMzMjU5NDY@._V1_Ratio0.7273_AL_.jpg',
  'resultType': 'Title',
  'title': 'Lexx'},
 {'description': '(2008) (Video)',
  'id': 'tt1833738',
  'image': 'https://imdb-api.com/images/original/MV5BMjAyMTYzNjk4NV5BMl5BanBnXkFtZTcwNzE4MTU0NA@@._V1_Ratio0.7273_AL_.jpg',
  'resultType': 'Title',
  'title': 'Lexx'},
 {'description': '(2018)',
  'id': 'tt10800568',
  'image': 'https://imdb-api.com/images/original/MV5BZWY5ODYwNzYtMmIyMS00YzhhLTg0OTAtODM1M2I5YzkxMzY1XkEyXkFqcGdeQXVyMTEwNDU1MzEy._V1_Ratio0.7273_AL_.jpg',
  'resultType': 'Title',
  'title': 'Lexxy Roxx: Lexy 360 - Der Film'},
 {'description': '(2014) (Short)',
  'id': 'tt4396272',
  'image': 'https://imdb-api.com/images/original/nopicture.jpg',
  'resultType': 'Title',
  'title': 'Lexxxus'},
 {'description': '(2016) (Video) aka "Lexxzibé Inonime Nirek & Elman
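
The query above places the title directly into the URL path, which is fine for `'lexx'`. For titles containing spaces, slashes, or question marks, it may be safer to percent-encode the title first. The sketch below shows one way to do that with the standard library; `search_title_url` and the key `'k_demo'` are hypothetical, and it assumes the API accepts percent-encoded titles.

```python
from urllib.parse import quote

BASE_URL = 'https://imdb-api.com/en/API'   # same endpoint root as the query above

def search_title_url(key, title):
  'hypothetical helper: builds a SearchTitle URL, percent-encoding the title'
  # safe='' also encodes "/" so a title cannot add extra path segments
  return '{base}/SearchTitle/{key}/{title}'.format(
      base=BASE_URL, key=key, title=quote(title, safe=''))

print(search_title_url('k_demo', 'Star Trek'))
```

A plain title such as `'lexx'` passes through `quote` unchanged, so the helper produces the same URL as the original `format` call in that case.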

Next we extract the item we want from the data set by applying a "filter":

In [3]:
items = [item for item in results if item['title']=='Lexx' and "TV" in item['description']]
assert len(items) == 1
lexx = items[0]
pprint(lexx)

{'description': '(1996) (TV Series)',
 'id': 'tt0115243',
 'image': 'https://imdb-api.com/images/original/MV5BOGFjMzQyMTYtMjQxNy00NjAyLWI2OWMtZGVhMjk4OGM3ZjE5XkEyXkFqcGdeQXVyNzMzMjU5NDY@._V1_Ratio0.7273_AL_.jpg',
 'resultType': 'Title',
 'title': 'Lexx'}


## Exercise 1

In the code cell below, re-write the "list comprehension" above as a loop so you understand how it works.
Notice how the "conditional list comprehension" is a compact way to "filter" items of interest from a large data set.


In [4]:
movies = []
for media in results:
  if 'TV' in media['description'] and media['title'] == 'Lexx':
    movies.append(media)
pprint(movies)

[{'description': '(1996) (TV Series)',
  'id': 'tt0115243',
  'image': 'https://imdb-api.com/images/original/MV5BOGFjMzQyMTYtMjQxNy00NjAyLWI2OWMtZGVhMjk4OGM3ZjE5XkEyXkFqcGdeQXVyNzMzMjU5NDY@._V1_Ratio0.7273_AL_.jpg',
  'resultType': 'Title',
  'title': 'Lexx'}]


Notice that the `lexx` dictionary contains an `id` field that uniquely identifies this record in the database.

We can use the `id` to fetch other information about the TV series, for example,
*   get names of all actors in the TV Series Lexx


In [5]:
url = "https://imdb-api.com/en/API/FullCast/{key}/{id}".format(key=API_KEY, id=lexx['id'])
response = requests.request("GET", url, headers={}, data={})
data = json.loads(response.text)

actors = data['actors']
pprint(actors[:10])   # recall the slice operator (it's a long list!)

[{'asCharacter': 'Stanley H. Tweedle / ... 61 episodes, 1996-2002',
  'id': 'nm0235978',
  'image': 'https://imdb-api.com/images/original/MV5BMTYxODI3OTM5Ml5BMl5BanBnXkFtZTgwMjM4ODc3MjE@._V1_Ratio1.3182_AL_.jpg',
  'name': 'Brian Downey'},
 {'asCharacter': 'Kai / ... 61 episodes, 1996-2002',
  'id': 'nm0573158',
  'image': 'https://imdb-api.com/images/original/MV5BMTY3MjQ4NzE0NV5BMl5BanBnXkFtZTgwNDE4ODc3MjE@._V1_Ratio1.3182_AL_.jpg',
  'name': 'Michael McManus'},
 {'asCharacter': '790 / ... 57 episodes, 1996-2002',
  'id': 'nm0386601',
  'image': 'https://imdb-api.com/images/original/MV5BMjMyMDM1NzgzNF5BMl5BanBnXkFtZTgwOTM4ODc3MjE@._V1_Ratio1.3182_AL_.jpg',
  'name': 'Jeffrey Hirschfield'},
 {'asCharacter': 'Xev Bellringer / ... 55 episodes, 1998-2002',
  'id': 'nm0781462',
  'image': 'https://imdb-api.com/images/original/MV5BMTk2MDQ4NzExOF5BMl5BanBnXkFtZTcwOTMyNzcyMQ@@._V1_Ratio0.7273_AL_.jpg',
  'name': 'Xenia Seeberg'},
 {'asCharacter': 'The Lexx 46 episodes, 1996-2002',
  'id': 'nm

Notice that the `asCharacter` field contains a number of different pieces of data as a single string, including the character name.
This kind of "free-form" text data is notoriously challenging to parse...

## Exercise 2

In the code cell below, write a python function that takes a string input (the text from `asCharacter` field)
and returns the number of episodes, if available, or None.

Hints:
* notice this is a numeric value followed by the word "episodes"
* recall str.split() and str.isdigit() and other string built-ins.

Add unit tests to cover as many cases from the `actors` data set above as you can.


In [6]:
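def infoToEpisode(asCharString):
  'input is asCharacter string, returns the episode count as an int, or None'
  tokens = asCharString.split()
  # the episode count is the numeric token immediately before "episode(s)"
  for count, label in zip(tokens, tokens[1:]):
    if count.isdigit() and label.startswith('episode'):
      return int(count)
  return None

assert infoToEpisode('abc def') is None
assert infoToEpisode('790') is None   # a number alone is not an episode count
assert infoToEpisode('Stanley H. Tweedle / ... 61 episodes, 1996-2002') == 61
assert infoToEpisode('Kai / ... 61 episodes, 1996-2002') == 61
assert infoToEpisode('790 / ... 57 episodes, 1996-2002') == 57
assert infoToEpisode('The Lexx 46 episodes, 1996-2002') == 46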




"""
The remaining functions are not part of the lab, but I am leaving them to help me in the future
"""
def episodes(asCharString):
  'input is asCharacter string, prints the episode count if found, otherwise returns None'
  infoSplit = asCharString.split()
  eps = infoSplit[-3:-1]   # e.g. ['61', 'episodes,']
  # guard against short strings, where the slice has fewer than 2 items
  if len(eps) == 2 and eps[0].isdigit():
    print('This actor is in', eps[0], eps[1])
  else:
    return None

def charEps(person):
  'input is one entry from the actors list, prints the episode count if found, otherwise returns None'
  return episodes(person['asCharacter'])

# loop charEps over every actor in Lexx:
#for person in actors:
#  charEps(person)
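
The hint for Exercise 2 describes the episode count as a number followed by the word "episodes". A regular expression can state that pattern directly. The sketch below is one possible alternative, not part of the lab solution; `EPISODES_PATTERN` and `episode_count` are hypothetical names.

```python
import re

# one or more digits, whitespace, then "episode" or "episodes"
EPISODES_PATTERN = re.compile(r'(\d+)\s+episodes?\b')

def episode_count(asCharString):
  'hypothetical regex-based alternative: returns the episode count as an int, or None'
  match = EPISODES_PATTERN.search(asCharString)
  return int(match.group(1)) if match else None

print(episode_count('790 / ... 57 episodes, 1996-2002'))
print(episode_count('The Lexx 46 episodes, 1996-2002'))
print(episode_count('abc def'))
```

Because the pattern requires the word "episode" after the digits, a numeric character name such as `790` is not mistaken for an episode count.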

## Exercise 3

In the code cell below, write a python function that takes a string input (the text from `asCharacter` field)
and returns just the character name.  This one may be even a little harder!

Hints:
* notice the character name is usually followed by a forward slash, `/`
* don't worry if your algorithm does not perfectly parse every character's name --
it may not really be possible to correctly handle all cases because the field format does not follow consistent rules

Add unit tests to cover as many cases from the `actors` data set above as you can.


In [7]:
def infoToCharacter(asCharString):
  'input is asCharacter string, returns the character name before the first "/", or None'
  charName = []
  for token in asCharString.split():
    if token == '/':
      return ' '.join(charName)
    charName.append(token)
  return None   # no "/" separator found

assert infoToCharacter('test test 123') is None
assert infoToCharacter('Stanley H. Tweedle / ... 61 episodes, 1996-2002') == 'Stanley H. Tweedle'
assert infoToCharacter('Kai / ... 61 episodes, 1996-2002') == 'Kai'
assert infoToCharacter('790 / ... 57 episodes, 1996-2002') == '790'
assert infoToCharacter('Lyekka / ... 10 episodes, 1998-2002') == 'Lyekka'
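
Some entries in the `actors` data have no slash at all, for example `'The Lexx 46 episodes, 1996-2002'`. One possible way to cover that case is to fall back to the text before the episode count. The sketch below is an assumption about how that could be handled, not a complete parser; `EPISODES_SUFFIX` and `character_name` are hypothetical names.

```python
import re

# trailing "<N> episode(s), <years>" suffix, e.g. " 46 episodes, 1996-2002"
EPISODES_SUFFIX = re.compile(r'\s*\d+\s+episodes?,.*$')

def character_name(asCharString):
  'hypothetical variant: text before the first " / ", else text before the episode count, else None'
  if ' / ' in asCharString:
    return asCharString.split(' / ')[0].strip()
  match = EPISODES_SUFFIX.search(asCharString)
  if match:
    # keep only what comes before the episode suffix; empty text counts as "no name"
    return asCharString[:match.start()].strip() or None
  return None

print(character_name('Stanley H. Tweedle / ... 61 episodes, 1996-2002'))
print(character_name('The Lexx 46 episodes, 1996-2002'))
print(character_name('test test 123'))
```

Strings with neither a slash nor an episode suffix still return `None`, which matches the behaviour tested in the cell above.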


## Exercise 4

Using the functions you developed above, define 2 list comprehensions that:
* create a list of 2-tuples with (actor name, character description) for actors in Lexx  (from `asCharacter` field)
* create a list of dictionaries, with keys:  'actor' and 'character' for the same data

Hint: this is a very simple problem - the goal is to learn how to build these lists using a comprehension.

Pretty print (pprint) your lists to visually verify the results.

In [8]:
actNameAsChar = [(person['name'], person['asCharacter']) for person in actors]
#pprint(actNameAsChar[:10])


listyDict = [{'actor': person[0], 'character': person[1]} for person in actNameAsChar]
pprint(listyDict[:10])

[{'actor': 'Brian Downey',
  'character': 'Stanley H. Tweedle / ... 61 episodes, 1996-2002'},
 {'actor': 'Michael McManus', 'character': 'Kai / ... 61 episodes, 1996-2002'},
 {'actor': 'Jeffrey Hirschfield',
  'character': '790 / ... 57 episodes, 1996-2002'},
 {'actor': 'Xenia Seeberg',
  'character': 'Xev Bellringer / ... 55 episodes, 1998-2002'},
 {'actor': 'Tom Gallant', 'character': 'The Lexx 46 episodes, 1996-2002'},
 {'actor': 'Nigel Bennett', 'character': 'Prince / ... 23 episodes, 2000-2002'},
 {'actor': 'Patricia Zentilli',
  'character': 'Bunny Priest / ... 16 episodes, 1999-2002'},
 {'actor': 'Lex Gigeroff',
  'character': 'Bound Man / ... 8 episodes, 1996-2002'},
 {'actor': 'Rolf Kanies',
  'character': 'Reginald J. Priest / ... 13 episodes, 2000-2002'},
 {'actor': 'Louise Wischermann',
  'character': 'Lyekka / ... 10 episodes, 1998-2002'}]
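
The exercise asks for comprehensions that use the parsing functions developed earlier. The sketch below shows one way that could look on three records copied from the output above. `name_before_slash` is a hypothetical stand-in for the Exercise 3 function so the block runs on its own.

```python
from pprint import pprint

# Three records copied from the `actors` output above (only the fields used here).
sample_actors = [
  {'name': 'Brian Downey', 'asCharacter': 'Stanley H. Tweedle / ... 61 episodes, 1996-2002'},
  {'name': 'Michael McManus', 'asCharacter': 'Kai / ... 61 episodes, 1996-2002'},
  {'name': 'Jeffrey Hirschfield', 'asCharacter': '790 / ... 57 episodes, 1996-2002'},
]

def name_before_slash(asCharString):
  'hypothetical stand-in for the Exercise 3 function: text before the first " / "'
  return asCharString.partition(' / ')[0]

# list of 2-tuples: (actor name, parsed character name)
pairs = [(person['name'], name_before_slash(person['asCharacter'])) for person in sample_actors]

# list of dictionaries built from the tuples, unpacking each pair in the comprehension
records = [{'actor': actor, 'character': character} for actor, character in pairs]

pprint(records)
```

Unpacking `actor, character` in the second comprehension avoids the positional `person[0]` and `person[1]` lookups and keeps the dictionary keys next to the values they describe.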


## Lab 3: Exercise 1

In [44]:
class SeriesActor:
  def __init__(self, name, character, episodes):
    self.name = name
    self.character = character
    self.episodes = episodes
  def __str__(self):
    template = "{0.name} as {0.character} for {0.episodes}"
    return template.format(self)
    #return f'{self.name}, plays {self.character} ({self.episodes} episodes).'

actor = SeriesActor('Caroll Spinney', 'Big Bird', 247)
print('Actor:', actor)

Actor: Caroll Spinney as Big Bird for 247


## Lab 3: Exercise 2

In [72]:
lexx_actors = [              
      SeriesActor(
          name=r['name'],
          character=r['asCharacter'][:10],
          episodes=infoToEpisode(r['asCharacter'])
      ) for r in actors
]

for i in range(10):
  print(lexx_actors[i])

Brian Downey as Stanley H. for 61
Michael McManus as Kai / ...  for 61
Jeffrey Hirschfield as 790 / ...  for 57
Xenia Seeberg as Xev Bellri for 55
Tom Gallant as The Lexx 4 for 46
Nigel Bennett as Prince / . for 23
Patricia Zentilli as Bunny Prie for 16
Lex Gigeroff as Bound Man  for 8
Rolf Kanies as Reginald J for 13
Louise Wischermann as Lyekka / . for 10


## Lab 3: Exercise 3

In [None]:
class TvSeries:
  def __init__(self, ID):
    self.id = ID