
COMP 215 - LAB 3
----------------
#### Name: James Floe
#### Date: April 11 2022
This lab exercise is mostly a review of strings, tuples, lists, dictionaries, and functions.
We will also see how "list comprehension" provides a compact form for "list accumulator" algorithms.

As usual, the first code cell simply imports all the modules we'll be using:
the json library, which provides functions for reading and writing JSON data,
the requests library, which provides functions for sending HTTP requests,
the matplotlib.pyplot library, which provides functions for plotting data, and
pprint, which prints nested data structures in a readable layout.


In [None]:
import json, requests 
import matplotlib.pyplot as plt
from pprint import pprint

We'll answer some questions about movies and TV shows with the IMDb database:  https://www.imdb.com/
> using the IMDb API:  https://imdb-api.com/api

You can register for your own API key, or simply use the one provided below.

Here's an example query:
 *   search for TV Series with title == "Lexx"

In [None]:
API_KEY = 'k_4hstehuv'

title = 'lexx'
url = "https://imdb-api.com/en/API/SearchTitle/{key}/{title}".format(key=API_KEY, title=title)

response = requests.request("GET", url, headers={}, data={})  # send a GET request to the URL built above

data = json.loads(response.text)  # parse the JSON body; recall json.loads from lab 1

results = data['results']
pprint(results)


{'errorMessage': '',
 'expression': 'lexx',
 'results': [{'description': '(1996) (TV Series)',
              'id': 'tt0115243',
              'image': 'https://imdb-api.com/images/original/MV5BOGFjMzQyMTYtMjQxNy00NjAyLWI2OWMtZGVhMjk4OGM3ZjE5XkEyXkFqcGdeQXVyNzMzMjU5NDY@._V1_Ratio0.7273_AL_.jpg',
              'resultType': 'Title',
              'title': 'Lexx'},
             {'description': '(2008) (Video)',
              'id': 'tt1833738',
              'image': 'https://imdb-api.com/images/original/MV5BMjAyMTYzNjk4NV5BMl5BanBnXkFtZTcwNzE4MTU0NA@@._V1_Ratio0.7273_AL_.jpg',
              'resultType': 'Title',
              'title': 'Lexx'},
             {'description': '(2018)',
              'id': 'tt10800568',
              'image': 'https://imdb-api.com/images/original/MV5BZWY5ODYwNzYtMmIyMS00YzhhLTg0OTAtODM1M2I5YzkxMzY1XkEyXkFqcGdeQXVyMTEwNDU1MzEy._V1_Ratio0.7273_AL_.jpg',
              'resultType': 'Title',
              'title': 'Lexxy Roxx: Lexy 360 - Der Film'},
             

Next we extract the item we want from the data set by applying a "filter":

In [None]:
items = [item for item in results if item['title']=='Lexx' and "TV" in item['description']]
assert len(items) == 1
lexx = items[0]
pprint(lexx)

{'description': '(1996) (TV Series)',
 'id': 'tt0115243',
 'image': 'https://imdb-api.com/images/original/MV5BOGFjMzQyMTYtMjQxNy00NjAyLWI2OWMtZGVhMjk4OGM3ZjE5XkEyXkFqcGdeQXVyNzMzMjU5NDY@._V1_Ratio0.7273_AL_.jpg',
 'resultType': 'Title',
 'title': 'Lexx'}
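
The search-and-filter steps above assume the request succeeded and that exactly one record matches. Below is a minimal sketch of the same steps with both assumptions checked. The helper names `extract_results` and `find_tv_series` are made up for this illustration, and the idea that a failed request comes back with a non-empty `errorMessage` is an assumption based on the empty `errorMessage` field in the output above. The sample data is copied from that output, with the image URLs left out.

```python
def extract_results(data):
    """Return the 'results' list, or raise if the API reported an error."""
    message = data.get('errorMessage')
    if message:  # assumption: an empty string means "no error"
        raise RuntimeError('IMDb API error: ' + message)
    return data.get('results') or []


def find_tv_series(results, title):
    """Return the single TV-series record whose title matches exactly."""
    matches = [r for r in results
               if r['title'] == title and 'TV' in r['description']]
    if len(matches) != 1:
        raise LookupError('expected 1 match, found {}'.format(len(matches)))
    return matches[0]


# Sample data copied from the search output above (image URLs omitted).
sample = {
    'errorMessage': '',
    'expression': 'lexx',
    'results': [
        {'description': '(1996) (TV Series)', 'id': 'tt0115243',
         'resultType': 'Title', 'title': 'Lexx'},
        {'description': '(2008) (Video)', 'id': 'tt1833738',
         'resultType': 'Title', 'title': 'Lexx'},
    ],
}

lexx_record = find_tv_series(extract_results(sample), 'Lexx')
print(lexx_record['id'])  # tt0115243
```

Raising an exception with a message is one possible choice here; compared with a bare `assert`, it reports how many records actually matched.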


## Exercise 1

In the code cell below, re-write the "list comprehension" above as a loop so you understand how it works.
Notice how the "conditional list comprehension" is a compact way to "filter" items of interest from a large data set.


In [None]:

items = []

for item in results:
  if item['title']=='Lexx' and 'TV' in item['description']:
    items.append(item)

lexx = items[0]

pprint(lexx)


{'description': '(1996) (TV Series)',
 'id': 'tt0115243',
 'image': 'https://imdb-api.com/images/original/MV5BOGFjMzQyMTYtMjQxNy00NjAyLWI2OWMtZGVhMjk4OGM3ZjE5XkEyXkFqcGdeQXVyNzMzMjU5NDY@._V1_Ratio0.7273_AL_.jpg',
 'resultType': 'Title',
 'title': 'Lexx'}
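
To check that the loop and the comprehension really are the same algorithm, here is a small self-contained sketch that runs both on a few sample records and compares the results. The records are trimmed copies of the search output above, and the predicate name `keep` is made up for this illustration. The built-in `filter` is included as a third spelling of the same idea.

```python
# Trimmed copies of three records from the search output above.
records = [
    {'title': 'Lexx', 'description': '(1996) (TV Series)'},
    {'title': 'Lexx', 'description': '(2008) (Video)'},
    {'title': 'Lexxy Roxx: Lexy 360 - Der Film', 'description': '(2018)'},
]


def keep(record):
    """The filter condition: exact title match and a TV description."""
    return record['title'] == 'Lexx' and 'TV' in record['description']


# 1. Conditional list comprehension.
by_comprehension = [r for r in records if keep(r)]

# 2. The equivalent "list accumulator" loop.
by_loop = []
for r in records:
    if keep(r):
        by_loop.append(r)

# 3. The built-in filter function.
by_filter = list(filter(keep, records))

print(by_comprehension == by_loop == by_filter)  # True
print(len(by_loop))                              # 1
```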


Notice that the `lexx` dictionary contains an `id` field that uniquely identifies this record in the database.

We can use the `id` to fetch other information about the TV series, for example,
*   get names of all actors in the TV Series Lexx


In [None]:
url = "https://imdb-api.com/en/API/FullCast/{key}/{id}".format(key=API_KEY, id=lexx['id'])
response = requests.request("GET", url, headers={}, data={})
data = json.loads(response.text)

actors = data['actors']
pprint(actors[:10])   # recall the slice operator (it's a long list!)

{'actors': [{'asCharacter': 'Stanley H. Tweedle / ... 61 episodes, 1996-2002',
             'id': 'nm0235978',
             'image': 'https://imdb-api.com/images/original/MV5BMTYxODI3OTM5Ml5BMl5BanBnXkFtZTgwMjM4ODc3MjE@._V1_Ratio1.3182_AL_.jpg',
             'name': 'Brian Downey'},
            {'asCharacter': 'Kai / ... 61 episodes, 1996-2002',
             'id': 'nm0573158',
             'image': 'https://imdb-api.com/images/original/MV5BMTY3MjQ4NzE0NV5BMl5BanBnXkFtZTgwNDE4ODc3MjE@._V1_Ratio1.3182_AL_.jpg',
             'name': 'Michael McManus'},
            {'asCharacter': '790 / ... 57 episodes, 1996-2002',
             'id': 'nm0386601',
             'image': 'https://imdb-api.com/images/original/MV5BMjMyMDM1NzgzNF5BMl5BanBnXkFtZTgwOTM4ODc3MjE@._V1_Ratio1.3182_AL_.jpg',
             'name': 'Jeffrey Hirschfield'},
            {'asCharacter': 'Xev Bellringer / ... 55 episodes, 1998-2002',
             'id': 'nm0781462',
             'image': 'https://imdb-api.com/images/original/M

Notice that the `asCharacter` field contains a number of different pieces of data as a single string, including the character name.
This kind of "free-form" text data is notoriously challenging to parse...

## Exercise 2

In the code cell below, write a python function that takes a string input (the text from `asCharacter` field)
and returns the number of episodes, if available, or None.

Hints:
* notice this is a numeric value followed by the word "episodes"
* recall str.split() and str.isdigit() and other string built-ins.

Add unit tests to cover as many cases from the `actors` data set above as you can.


In [None]:

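def returnNumberOfEpisodes(asCharacter):
  'Return the episode count found in an asCharacter string, or None if there is none.'
  episode = None
  UnfltrWrds = asCharacter.split(' ')

  # Keep the last all-digit word. Year ranges such as "1996-2002" are skipped
  # because of the hyphen, but a lone year ("5 episodes, 2000") overwrites the
  # episode count, which is why years show up in the output below.
  for item in UnfltrWrds:
    if item.isdigit():
      episode = int(item)

  return episode

# Function Test
numOfEpisodesList = [returnNumberOfEpisodes(item['asCharacter']) for item in actors]
print(numOfEpisodesList)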


[61, 61, 57, 55, 46, 23, 16, 8, 13, 10, 7, 8, 3, 5, 6, 7, 7, 5, 5, 6, 5, 4, 5, 5, 5, 5, 5, 1, 4, 4, 4, 2000, 2000, 4, 4, 4, 4, 4, 3, 3, 1, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 2001, 3, 2, 2, 2, 2, 1999, 2, 2001, 2, 2, 2000, 2001, 2001, 2001, 2002, 1999, 2, 2002, 2, 2, 2001, 2, 2, 2, 1997, 2, 2, 2, 2, 2, 2, 2, 1999, 2, 2000, 1, 2001, 1, 1, 1999, 1, 1998, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 2001, 2002, 1997, 1997, 1999, 1999, 1999, 1999, 1999, 1999, 1999, 2000, 2000, 2001, 2002, 2002, 2002, 1999, 1999, 1999, 2000, 2000, 2000, 2000, 2001, 2001, 2002, 2002, 2002, 1996, 1997, 1997, 1997, 1998, 1998, 1999, 1999, 1999, 1999, 1999, 2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 1997, 1998, 1, 1999, 1999, 1999, 1999, 1999, 2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 1997, 1998, 1998, 1999, 1999, 1999, 1999, 1999, 1999, 2000, 2001, 2001, 2001, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002, 2002, 2002, 1997, 1998, 1998, 1999, 1999, 
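
The values like 2000 and 2001 in this list are years, not episode counts: when the text ends in a single year (for example 'Duke 5 episodes, 2000'), the year is the last all-digit word, so it wins. One possible way around this, sketched below, is to look only for a number that is directly followed by the word "episode" or "episodes". The name `parse_episode_count` is made up for this illustration, and the singular form "1 episode" is an assumption about how the API writes a single appearance.

```python
import re

# One or more digits, whitespace, then "episode" or "episodes".
EPISODES = re.compile(r'(\d+)\s+episodes?\b')


def parse_episode_count(as_character):
    """Return the number of episodes as an int, or None if not present."""
    match = EPISODES.search(as_character)
    return int(match.group(1)) if match else None


# Unit tests built from strings that appear in the outputs of this lab.
assert parse_episode_count('Stanley H. Tweedle / ... 61 episodes, 1996-2002') == 61
assert parse_episode_count('The Lexx 46 episodes, 1996-2002') == 46
assert parse_episode_count('Duke 5 episodes, 2000') == 5   # year is ignored
assert parse_episode_count('Handler 2') is None            # a number, but not a count
assert parse_episode_count('') is None
# Assumed format for a single appearance (not shown in the data above).
assert parse_episode_count('Guard 1 episode, 1999') == 1
print('all episode tests passed')
```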

## Exercise 3

In the code cell below, write a python function that takes a string input (the text from `asCharacter` field)
and returns just the character name.  This one may be even a little harder!

Hints:
* notice the character name is usually followed by a forward slash, `/`
* don't worry if your algorithm does not perfectly parse every character's name --
it may not really be possible to correctly handle all cases because the field format does not follow consistent rules

Add unit tests to cover as many cases from the `actors` data set above as you can.


In [None]:

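def returnCharacterName(asCharacter):
  'Return the words of an asCharacter string that come before the first "/".'
  name = ''
  unFltrdWrd = asCharacter.split(' ')

  # Collect words until a stand-alone "/" is reached. Each word is added with a
  # leading space, so the result starts with a space, and a string with no "/"
  # is returned whole (episode count and years included).
  for item in unFltrdWrd:
    if item == '/':
      break
    name = name + ' ' + item
  return name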

# Function Test
characterNames = [returnCharacterName(item['asCharacter']) for item in actors]
print(characterNames)

[' Stanley H. Tweedle', ' Kai', ' 790', ' Xev Bellringer', ' The Lexx 46 episodes, 1996-2002', ' Prince', ' Bunny Priest', ' Bound Man', ' Reginald J. Priest', ' Lyekka', ' Divine Predecessor', ' His Divine Shadow', ' Mothbreeder', ' Vlad', ' Megashadow Adjutant', ' Zev Bellringer 7 episodes, 1996-2000', ' Giggerota', ' The Time Prophet', ' Mantrid 5 episodes, 1998-2000', ' Fifi', ' Holo Cleric', ' Dougall 4 episodes, 2001-2002', ' Jood', ' Transport Major', ' Child', ' Holo Lawyr', ' Video Customs Officer', ' Mothbreeder', ' Petrif', ' Joshua', ' Holo Judge', ' Duke 5 episodes, 2000', ' May 5 episodes, 2000', ' Anchorman', ' Tina 4 episodes, 2001-2002', ' Lead Balloonist', ' Handler 2', ' Archaeologist', ' Matron', ' Brock', ' Blue Team Member', ' Megashadow Admiral 4 episodes, 1996-1997', ' Tem 4 episodes, 1996-1997', ' Wig Girl 4 episodes, 1996-1997', ' Correction Center Guard 4 episodes, 1996-1997', ' Honar 4 episodes, 1996-1997', ' Monk 4 episodes, 1996-1997', ' Scientist 4 episod
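
Two things stand out in this output: every name starts with a space, and entries with no `/` (such as 'The Lexx 46 episodes, 1996-2002') keep their episode text. A possible refinement is sketched below: cut at the first `/`, then strip a trailing "N episodes, ..." part if one is left. The name `parse_character_name` is made up for this illustration, and a character name that itself contains a `/` would still be cut short.

```python
import re

# A trailing "<number> episode(s) ..." part, with any space before it.
TRAILING_EPISODES = re.compile(r'\s*\d+\s+episodes?\b.*$')


def parse_character_name(as_character):
    """Return the character name from an asCharacter string (best effort)."""
    name = as_character.partition('/')[0]      # text before the first "/"
    name = TRAILING_EPISODES.sub('', name)     # drop "46 episodes, 1996-2002"
    return name.strip()


# Unit tests built from strings that appear in the outputs of this lab.
assert parse_character_name('Stanley H. Tweedle / ... 61 episodes, 1996-2002') == 'Stanley H. Tweedle'
assert parse_character_name('790 / ... 57 episodes, 1996-2002') == '790'
assert parse_character_name('The Lexx 46 episodes, 1996-2002') == 'The Lexx'
assert parse_character_name('Duke 5 episodes, 2000') == 'Duke'
assert parse_character_name('Handler 2') == 'Handler 2'   # the 2 is part of the name
print('all name tests passed')
```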


## Exercise 4

Using the functions you developed above, define 2 list comprehensions that:
* create a list of 2-tuples with (actor name, character description) for actors in Lexx  (from `asCharacter` field)
* create a list of dictionaries, with keys:  'actor' and 'character' for the same data

Hint: this is a very simple problem - the goal is to learn how to build these lists using a comprehension.

Pretty print (pprint) your lists to visually verify the results.

In [None]:
actorDetails = [(item['name'], item['asCharacter']) for item in actors]
dictList = [{ 'actor': item['name'], 'character': item["asCharacter"] } for item in actors]

#Test
pprint(actorDetails)
pprint(dictList)








[('Brian Downey', 'Stanley H. Tweedle / ... 61 episodes, 1996-2002'),
 ('Michael McManus', 'Kai / ... 61 episodes, 1996-2002'),
 ('Jeffrey Hirschfield', '790 / ... 57 episodes, 1996-2002'),
 ('Xenia Seeberg', 'Xev Bellringer / ... 55 episodes, 1998-2002'),
 ('Tom Gallant', 'The Lexx 46 episodes, 1996-2002'),
 ('Nigel Bennett', 'Prince / ... 23 episodes, 2000-2002'),
 ('Patricia Zentilli', 'Bunny Priest / ... 16 episodes, 1999-2002'),
 ('Lex Gigeroff', 'Bound Man / ... 8 episodes, 1996-2002'),
 ('Rolf Kanies', 'Reginald J. Priest / ... 13 episodes, 2000-2002'),
 ('Louise Wischermann', 'Lyekka / ... 10 episodes, 1998-2002'),
 ('John Dunsworth', 'Divine Predecessor / ... 7 episodes, 1996-2002'),
 ('Walter Borden', 'His Divine Shadow / ... 8 episodes, 1996-2002'),
 ('David Albiston', 'Mothbreeder / ... 3 episodes, 1998-2002'),
 ('Minna Aaltonen', 'Vlad / ... 5 episodes, 2001-2002'),
 ('Clive Sweeney', 'Megashadow Adjutant / ... 6 episodes, 1996-2000'),
 ('Eva Habermann', 'Zev Bellringer 7 

**Lab 3: Exercise 1 from PDF**

Using what we learned from the textbook, define a simple SeriesActor class that defines state
variables related to an actor in a TV series. Your SeriesActor class should define at least 3 "state"
variables: name, character, episodes.

• Provide an `__init__(self, ...)` method to initialize a new SeriesActor object with
specific data values and a little code to test your class.

• Add a `__str__(self)` method that returns a nicely formatted string representation of the
object, plus a little code to test it.

In [None]:

class SeriesActor:

  def __init__(self, initActor, initCharacter, initEpisodes):
    'Initialize a new series actor object.'
    self.actor = initActor
    self.character = initCharacter
    self.episodes = initEpisodes

  def __str__(self):
    return 'name is ' + str(self.actor) + '\ncharacter is ' + str(self.character) + '\nnumber of episodes are ' + str(self.episodes) + '\n'

#Test
a = SeriesActor('bob', 'joe', 40)
print(a)

name is bob
character is joe
number of episodes are 40



**Lab 3: Exercise 2 from PDF**

In lab 2 we used both a tuple and then a dictionary to define each series-actor record.
Add a code cell that defines another list comprehension that builds each of the series-actor records as
an object.

• write a loop over your list of series-actor objects to print the character name of each. Notice
how the object dot-notation creates a "namespace" for the data items contained within.

Answer the following questions in a text block below this code:

• what advantage(s) does a class / object have over a simple tuple or dictionary for
representing a record?
• Is it possible to call the methods defined by a class in the “normal” way, where the object is
simply passed as the first parameter? How? (May take some experimenting).

In [None]:
a = [SeriesActor( item['name'], returnCharacterName(item['asCharacter']), returnNumberOfEpisodes(item['asCharacter'])) for item in actors]

#Test
for i in a:
  print(i)

# An object is more organized and legible than a tuple or a dictionary.
# A class lets us give each record several named attributes, set them in one
# place (__init__), and write algorithms that assign values to them.
# Those labeled attributes are then easy to access with dot notation.
# Tuples and dictionaries are much less organized.
# The items in a tuple have no labels; they can only be reached by position.
# A dictionary does label its items with keys,
# but its records are not tied to a specific class.
# Neither tuples nor dictionaries can have their own defined methods, which objects can.



name is Brian Downey
character is  Stanley H. Tweedle
number of episodes are 61

name is Michael McManus
character is  Kai
number of episodes are 61

name is Jeffrey Hirschfield
character is  790
number of episodes are 57

name is Xenia Seeberg
character is  Xev Bellringer
number of episodes are 55

name is Tom Gallant
character is  The Lexx 46 episodes, 1996-2002
number of episodes are 46

name is Nigel Bennett
character is  Prince
number of episodes are 23

name is Patricia Zentilli
character is  Bunny Priest
number of episodes are 16

name is Lex Gigeroff
character is  Bound Man
number of episodes are 8

name is Rolf Kanies
character is  Reginald J. Priest
number of episodes are 13

name is Louise Wischermann
character is  Lyekka
number of episodes are 10

name is John Dunsworth
character is  Divine Predecessor
number of episodes are 7

name is Walter Borden
character is  His Divine Shadow
number of episodes are 8

name is David Albiston
character is  Mothbreeder
number of episodes 
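
The second question above (calling a method the "normal" way, with the object passed as the first parameter) can be explored with a small experiment. In the sketch below, a cut-down copy of SeriesActor is defined so the cell runs on its own; the `__str__` format is simplified for this illustration. Looking the method up on the class instead of on the object gives a plain function, and the object is then passed explicitly as `self`.

```python
class SeriesActor:
    """Cut-down copy of the class above, for this experiment only."""

    def __init__(self, actor, character, episodes):
        self.actor = actor
        self.character = character
        self.episodes = episodes

    def __str__(self):
        return '{} plays {} ({} episodes)'.format(
            self.actor, self.character, self.episodes)


downey = SeriesActor('Brian Downey', 'Stanley H. Tweedle', 61)

dot_call = downey.__str__()              # usual dot notation: self is filled in for us
plain_call = SeriesActor.__str__(downey) # function call: the object is the first argument

print(plain_call)
print(dot_call == plain_call == str(downey))  # True
```

So `obj.method(args)` behaves like shorthand for `Class.method(obj, args)`.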

**Lab 3: Exercise 3 from PDF**
Let’s bundle all the code we’ve been experimenting with into a few classes that “encapsulate” what
we’ve learned about IMDb API record formats.

• Define a TvSeries class. The init method takes the id for an IMDb TV Series, fetches the record
and stores some key values (like id, title, date). You should define a helper method to parse
the date from the description field.

• Define an Actor class. The init method takes the id for an IMDb Actor, fetches the record, and
stores some key values (like id, name, birthDate)

• Define a Character class. The init method takes a TVSeries object and an Actor object, along
with name of the character played by that Actor, and the number of episodes they were in.

• Add an accessor method to your TvSeries class to get a rating for the series. Take one
parameter, the type of rating, and return the rating, if available, otherwise None.

• Add an accessor method to your TvSeries class to get a list of Character objects for the series.
You will need the parse functions you wrote in Lab 2, consider integrating them here as helper
methods.

Add a new code block with a little code that tests all of your classes are working as expected.
Feel free to have fun extending these models in various ways – the possibilities are endless.
Notice that what we are doing here is “wrapping” the http web API in a simpler, easy to use Python
API that hides all the details of how the web API works. This is an example of one of the most
important principles in Computer Science: The Principle of Information Hiding
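
The helper that parses the date from the description field, mentioned in the TvSeries item above, could be sketched as follows. This is only one possible approach: it assumes the year is the first four-digit number that follows an opening parenthesis, as in the search results earlier in this lab ('(1996) (TV Series)'). The function name `parse_year` is made up for this illustration.

```python
import re


def parse_year(description):
    """Return the first 4-digit year that follows a '(' as an int, or None."""
    match = re.search(r'\((\d{4})', description)
    return int(match.group(1)) if match else None


# Descriptions copied from the search output earlier in this lab.
assert parse_year('(1996) (TV Series)') == 1996
assert parse_year('(2008) (Video)') == 2008
assert parse_year('(2018)') == 2018
# No year available.
assert parse_year('(TV Series)') is None
assert parse_year('') is None
print('all year tests passed')
```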



In [None]:

class TvSeries:

  @staticmethod
  def getTitle(jsonText):
    'Return the words of fullTitle that come before the first word containing "(".'
    fullTitle = jsonText['fullTitle']
    name = ''
    unFltrdWrd = fullTitle.split(' ')
    for item in unFltrdWrd:
      if '(' in item:
        break
      name = name + ' ' + item
    return name
    
  def __init__(self, TvID):
    url = "https://imdb-api.com/en/API/Title/{key}/{title}".format(key=API_KEY, title=TvID)
    response = requests.request("GET", url, headers={}, data={})
    data = json.loads(response.text)

    self.Title = TvSeries.getTitle(data)
    self.id = data['id']
    self.date = data['year']
    self.data = data

  def __str__(self):
    return 'Title: ' + str(self.Title) + '\nid: ' + str(self.id) + '\ndate: ' + str(self.date) + '\n'

  def getRating(self):
    return self.data['imDbRating']

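  def createCharacterObjects(self):
    'Return a list of Character objects, one per actor in this series.'
    actors = self.data['actorList']
    # Pass self instead of TvSeries(self.id) so the series is not re-fetched for every actor.
    objectList = [Character(self, Actor(item['id'])) for item in actors]
    return objectList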

class Actor:

  def __init__(self, ActorID):
    url = "https://imdb-api.com/en/API/Name/{key}/{name}".format(key=API_KEY, name=ActorID)
    response = requests.request("GET", url, headers={}, data={})
    data = json.loads(response.text)

    self.name =  data['name']
    self.id = data['id']
    self.BirthDate = data['birthDate']

  def __str__(self):
    return 'Name: ' + str(self.name) + '\nid: ' + str(self.id) + '\nBirthday: ' + str(self.BirthDate) + '\n'

class Character:

  def __init__(self, series, actor):
    # The parameters are named series and actor so they do not shadow the TvSeries and Actor classes.
    self.character = ''
    self.numberOfEpisodes = ''
    data = series.data
    List = [{ 'actor': item['name'], 'character': item["asCharacter"] } for item in data['actorList']]

    for item in List:
      if (actor.name in item['actor']):
        self.character = returnCharacterName(item['character'])
        self.numberOfEpisodes = returnNumberOfEpisodes(item['character'])

  def __str__(self):
    return 'Character: ' + str(self.character) + '\nNumber of Episodes which they appear in: ' + str(self.numberOfEpisodes) + '\n'

a = TvSeries('tt0115243')
print(a)
b = Actor('nm0235978')
print(b)
c = Character(a, b)   # reuse the objects fetched above instead of requesting them again
print(c)
print(a.getRating())  # the parentheses are needed to actually call the method
d = a.createCharacterObjects()
for i in d:
  print(i)










Title:  Lexx
id: tt0115243
date: 1996

Name: Brian Downey
id: nm0235978
Birthday: 1944-10-31



AttributeError: ignored
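
The exercise asks for a rating accessor that takes the type of rating and returns None when it is not available, while `getRating` above always reads `imDbRating`. A minimal sketch of that lookup is shown below as a stand-alone function working on a plain dictionary, so it can be tested without calling the API. The name `get_rating`, the `RATING_KEYS` table, and the `metacriticRating` key are assumptions made for this illustration; only `imDbRating` comes from the code above. The sample record uses made-up values, not real API output.

```python
# Maps a rating type to the key that holds it in a title record.
RATING_KEYS = {
    'imdb': 'imDbRating',              # key used by getRating above
    'metacritic': 'metacriticRating',  # assumed key name
}


def get_rating(record, rating_type='imdb'):
    """Return the requested rating as a float, or None if not available."""
    key = RATING_KEYS.get(rating_type.lower())
    if key is None:
        return None                    # unknown rating type
    try:
        return float(record.get(key))
    except (TypeError, ValueError):
        return None                    # missing, empty, or non-numeric value


# Made-up sample record for illustration only.
sample_record = {'imDbRating': '7.5', 'metacriticRating': ''}

print(get_rating(sample_record, 'imdb'))        # 7.5
print(get_rating(sample_record, 'metacritic'))  # None
print(get_rating(sample_record, 'other'))       # None
```

Inside the TvSeries class, the same logic could become a method that reads from `self.data`.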