# Felipe Castillo
# Data Preparation
# Movie DB
# 02/14/2022

# Lesson 7: Advanced web scraping and data gathering
## Activity 2: Build your own movie database by reading from an API
### This notebook does the following
* Retrieves and prints basic data about a movie (title entered by the user) from the web (OMDB database)
* If a poster of the movie can be found, it downloads the file and saves it at a user-specified location

In [18]:
import urllib.request, urllib.parse, urllib.error
import json
import os 

# Raw string so the backslashes in the Windows path are not read as escape sequences
os.chdir(r"C:\DataScience_DSC_540\Week9_10")

### Load the secret API key into a variable from a JSON file stored in the same folder (you have to get your own key from the OMDB website; it has a 1000-request daily limit)
Hint: Use **`json.loads()`**

#### Note: The following cell will not be executed in the solution notebook because the author cannot give out his private API key.
#### Students/users/instructors will need to obtain a key and store it in a JSON file.
#### For the code's sake, we are calling this file `APIkeys.json`. But you need to store your own key in this file.
#### An example file called `"APIkey_Bogus_example.json"` is given along with the notebook. Just change the code in this file and rename it `APIkeys.json`. The file name does not matter, of course.
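
As a minimal sketch of the hint above: `json.loads()` parses a string, while `json.load()` reads directly from an open file. The key name `OMDBapi` follows the cell below; the bogus value and the temporary file are placeholders used only for illustration.

```python
import json
import os
import tempfile

def load_api_key(path, key_name="OMDBapi"):
    """Return the API key stored under key_name in the JSON file at path."""
    with open(path) as fh:
        # json.loads() parses a string, so read the file contents first;
        # json.load(fh) would do both steps in one call.
        keys = json.loads(fh.read())
    return keys[key_name]

# Demonstration with a throwaway file and a bogus key (not a real credential).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "APIkeys.json")
    with open(demo_path, "w") as fh:
        json.dump({"OMDBapi": "bogus-key"}, fh)
    demo_key = load_api_key(demo_path)

print(demo_key)
```

Keeping the key in a separate file means the notebook itself can be shared without exposing the credential.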

In [19]:
# Write your code here

# The file omniFile.json stores the API key under the key 'OMDBapi'
with open('omniFile.json') as file:
    keys = json.load(file)
    omdbapi = keys['OMDBapi']
    

### The final URL to be passed should look like: http://www.omdbapi.com/?t=movie_name&apikey=secretapikey 
Do the following,
* Assign the OMDB portal (http://www.omdbapi.com/?) as a string to a variable `serviceurl` (don't miss the `?`)
* Create a variable `apikey` with the last portion of the URL ("&apikey=secretapikey"), where `secretapikey` is your own API key (an actual code)
* The movie name portion i.e. "t=movie_name" will be addressed later
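
The three pieces described above can be assembled roughly as follows. This is a sketch, not the notebook's own solution: the function name `build_url` and the placeholder key `secretapikey` are assumptions made for illustration.

```python
import urllib.parse

SERVICE_URL = "http://www.omdbapi.com/?"

def build_url(movie_name, secret_key):
    """Assemble the service URL, the encoded title, and the apikey suffix."""
    # urlencode escapes spaces and special characters in the title
    query = urllib.parse.urlencode({"t": movie_name})
    apikey = "&apikey=" + secret_key
    return SERVICE_URL + query + apikey

print(build_url("The Godfather", "secretapikey"))
# http://www.omdbapi.com/?t=The+Godfather&apikey=secretapikey
```

Encoding the title matters as soon as a movie name contains a space or punctuation, which plain string concatenation would pass through unescaped.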

In [41]:
# Write your code here


service_url = 'http://www.omdbapi.com/?'
api_key = '&apikey='+omdbapi



#print (api_key)




### Write a utility function `print_json` to print nicely the movie data from a JSON file (which we will get from the portal)
Here are the keys of a JSON file,

'Title', 'Year', 'Rated', 'Released', 'Runtime', 'Genre', 'Director', 'Writer', 'Actors', 'Plot', 'Language','Country', 'Awards', 'Ratings', 'Metascore', 'imdbRating', 'imdbVotes', 'imdbID'
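
Most of these keys hold plain strings, but `Ratings` holds a list of dictionaries, so it prints less cleanly than the rest. A small optional helper (the name `format_ratings` is hypothetical) could flatten it; the sample values mirror the Titanic record printed later in this notebook.

```python
def format_ratings(ratings):
    """Turn a list of {'Source', 'Value'} dicts into one readable string."""
    return "; ".join("{0}: {1}".format(r["Source"], r["Value"]) for r in ratings)

sample = [
    {"Source": "Internet Movie Database", "Value": "7.8/10"},
    {"Source": "Rotten Tomatoes", "Value": "89%"},
    {"Source": "Metacritic", "Value": "75/100"},
]
print(format_ratings(sample))
# Internet Movie Database: 7.8/10; Rotten Tomatoes: 89%; Metacritic: 75/100
```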

In [138]:
# Write your code here


def print_json(jsonObject):
    movie_title_keys = ['Title', 'Year', 'Rated', 'Released', 'Runtime', 
                 'Genre', 'Director', 'Writer', 'Actors', 'Plot', 
                 'Language','Country', 'Awards', 'Ratings', 'Metascore', 
                 'imdbRating', 'imdbVotes', 'imdbID']
    
    # Loop through the expected keys in display order
    for key in movie_title_keys:
        # Print the entry only if the payload actually contains this key
        if key in jsonObject:
            # Example output -> Title: Titanic
            print("{0}: {1}".format(key, jsonObject[key]))






### Write a utility function to download a poster of the movie based on the information from the JSON dataset and save it in your local folder

* Use `os` module
* The poster data is stored in the JSON key 'Poster'
* You may want to split the name of the Poster file and extract the file extension only. Let's say the extension is ***'jpg'***.
* Then later join this extension to the movie name and create a filename like ***movie.jpg***
* Use the Python command `open` to open a file and write the poster data. Close the file after done.
* This function may not return anything. It just saves the poster data as an image file.
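
One way to sketch the filename step described above is with `urllib.parse.urlparse` and `os.path.splitext`, which ignore any query string and handle URLs with several periods. The helper name `poster_filename` is an assumption; the URL is the Titanic poster address quoted in the solution cell below.

```python
import os
import urllib.parse

def poster_filename(title, poster_url):
    """Build '<title><ext>' using the extension found in poster_url."""
    path = urllib.parse.urlparse(poster_url).path  # drop scheme, host, query
    extension = os.path.splitext(path)[1]          # e.g. '.jpg'
    return title + extension

url = ("https://m.media-amazon.com/images/M/"
       "MV5BMDdmZGU3NDQtY2E5My00ZTliLWIzOTUtMTY4ZGI1YjdiNjk3XkEyXkFqcGdeQXVyNTA4NzY1MzY"
       "@._V1_SX300.jpg")
print(poster_filename("Titanic", url))
# Titanic.jpg
```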

In [140]:
# Write your code here

# os is already imported above (used to set the working directory)

def save_poster(json_payload):
    title = json_payload['Title']
    poster = json_payload['Poster']
    # Take the text after the last period in the URL as the file extension, e.g.
    # https://m.media-amazon.com/images/M/MV5BMDdmZGU3NDQtY2E5My00ZTliLWIzOTUtMTY4ZGI1YjdiNjk3XkEyXkFqcGdeQXVyNTA4NzY1MzY@._V1_SX300.jpg
    # yields 'jpg'
    poster_extension = poster.split('.')[-1]
    
    poster_data = urllib.request.urlopen(poster).read()
    
    # Save into a 'SavedPosters' folder inside the current working directory
    # (mapped to the Week9_10 folder above)
    saveloc = os.path.join(os.getcwd(), 'SavedPosters')
    
    # Create the folder if it does not exist yet
    if not os.path.isdir(saveloc):
        os.mkdir(saveloc)
        
    # Write the image bytes to SavedPosters/<title>.<extension>;
    # the with-block closes the file automatically
    filename = os.path.join(saveloc, str(title) + '.' + poster_extension)
    with open(filename, 'wb') as f:
        f.write(poster_data)
    
    
    
    

### Write a utility function `search_movie` to search a movie by its name, print the downloaded JSON data (use the `print_json` function for this) and save the movie poster in the local folder (use `save_poster` function for this)

* Use a `try-except` block for this, i.e. try to connect to the web portal; if successful, proceed, but if not (i.e. an exception is raised), just print an error message
* Here use the previously created variables `serviceurl` and `apikey`
* You have to pass on a dictionary with a key `t` and the movie name as the corresponding value to `urllib.parse.urlencode()` function and then add the `serviceurl` and `apikey` to the output of the function to construct the full URL
* This URL will be used for accessing the data
* The JSON data has a key called `Response`. If it is `True`, that means the read was successful. Check this before processing the data. If not successful, then print the JSON key `Error`, which will contain the appropriate error message returned by the movie database.
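
The `Response` check in the last bullet can be sketched as below. The flag arrives as the string `'True'` or `'False'`, not a boolean. The helper name `summarize_response` and both payloads are hypothetical, written only to illustrate the branch logic.

```python
def summarize_response(payload):
    """Return (ok, message) for a parsed reply from the movie database."""
    # Compare against the string 'True'; the flag is not a JSON boolean
    if payload.get("Response") == "True":
        return True, payload.get("Title", "")
    # On failure, surface the message stored under the 'Error' key
    return False, payload.get("Error", "Unknown error")

# Hypothetical payloads for illustration only
found = {"Response": "True", "Title": "Titanic"}
missing = {"Response": "False", "Error": "Movie not found!"}

print(summarize_response(found))
print(summarize_response(missing))
```

Returning the database's own `Error` text gives the user a more specific message than a fixed string.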

In [121]:
# Write your code here



def search_movie(title):
    try:
        # Build the URL from the variables defined earlier; urlencode escapes
        # spaces and special characters in the title
        url = service_url + urllib.parse.urlencode({'t': title}) + api_key
        print("Retrieving title {0}".format(title))
        ur_request = urllib.request.urlopen(url)
        data = ur_request.read()
        json_payload = json.loads(data)
        
        # Check that the API reported a successful lookup
        if json_payload['Response'] == 'True':
            print("--------Saving {0} protocol-----".format(title))
            print("--------Title returned Value-----")
            print_json(json_payload)
            
            # Save the poster only when a usable URL is present
            # (assumption: a missing poster is reported as 'N/A')
            poster = json_payload.get('Poster')
            if poster and poster != 'N/A':
                save_poster(json_payload)
                     
        else:
            print("Error empty value received")
    
    # Catch any exception (network failure, malformed JSON, ...) and report it
    except Exception as e:
        print(e.args)



### Test `search_movie` function by entering *Titanic*

In [139]:
# Write your code here

search_movie("Titanic")

Retrieving title Titanic
--------Saving Titanic protocol-----
--------Title returned Value-----
Title: Titanic
Year: 1997
Rated: PG-13
Released: 19 Dec 1997
Runtime: 194 min
Genre: Drama, Romance
Director: James Cameron
Writer: James Cameron
Actors: Leonardo DiCaprio, Kate Winslet, Billy Zane
Plot: A seventeen-year-old aristocrat falls in love with a kind but poor artist aboard the luxurious, ill-fated R.M.S. Titanic.
Language: English, Swedish, Italian, French
Country: United States, Mexico
Awards: Won 11 Oscars. 125 wins & 83 nominations total
Ratings: [{'Source': 'Internet Movie Database', 'Value': '7.8/10'}, {'Source': 'Rotten Tomatoes', 'Value': '89%'}, {'Source': 'Metacritic', 'Value': '75/100'}]
Metascore: 75
imdbRating: 7.8
imdbVotes: 1,117,107
imdbID: tt0120338


### Test `search_movie` function by entering "*Random_error*" (obviously this will not be found and you should be able to check whether your error catching code is working properly)

In [108]:
# Write your code here

search_movie("Random_error")

Retrieving title Random_error
Error empty value received


### Look for a folder called 'SavedPosters' in the same directory you are working in. It should contain a file called 'Titanic.jpg'. Open it and check that the poster downloaded correctly!
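
This check can also be done from code. The sketch below is an optional helper (the name `poster_saved` is hypothetical) that reports whether a non-empty image file exists; pass whichever folder your `save_poster` function writes to. The demonstration uses a temporary folder and dummy bytes rather than a real poster.

```python
import os
import tempfile

def poster_saved(folder, title, extension="jpg"):
    """Return True if <folder>/<title>.<extension> exists and is not empty."""
    path = os.path.join(folder, "{0}.{1}".format(title, extension))
    return os.path.isfile(path) and os.path.getsize(path) > 0

# Demonstration with a throwaway folder and placeholder bytes
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "Titanic.jpg"), "wb") as fh:
        fh.write(b"placeholder image bytes")
    found_titanic = poster_saved(tmp, "Titanic")
    found_other = poster_saved(tmp, "Random_error")

print(found_titanic, found_other)
```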