# Logging, Read me

In this directory, we study logging. Logging is a means of tracking events that happen when some software runs. 
For more details, see https://docs.python.org/3/howto/logging.html#logging-basic-tutorial.
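
As a small illustration of "tracking events", the sketch below sends one event per standard level to a logger that keeps only WARNING and above. The logger name `readme_demo_levels` and the in-memory stream are assumptions made for this example; they are not part of the files in this directory.

```python
import io
import logging

# Hypothetical logger name and in-memory stream, chosen only for this sketch.
stream = io.StringIO()
demo_logger = logging.getLogger("readme_demo_levels")
demo_logger.setLevel(logging.WARNING)  # keep WARNING (30) and above
demo_logger.propagate = False          # do not pass records on to the root logger
demo_logger.addHandler(logging.StreamHandler(stream))

# One event per standard level, from least to most severe.
demo_logger.debug("debug message")        # level 10, dropped
demo_logger.info("info message")          # level 20, dropped
demo_logger.warning("warning message")    # level 30, kept
demo_logger.error("error message")        # level 40, kept
demo_logger.critical("critical message")  # level 50, kept

recorded = stream.getvalue().splitlines()
print(recorded)
```

Writing to a `StringIO` object instead of the console is only a convenience here, so that the recorded lines can be inspected as a list.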


We download the table of the top-grossing movies of 2021 from the website and print

(1) the list of popular horror movies and

(2) the average of Tickets Sold.

We also try to print the average of 2021 Gross in order to trigger errors: in the data frame, the 2021 Gross column has dtype object, so its mean cannot be computed directly.

The table is as follows: 

In [1]:
import pandas as pd
import numpy as np
import requests

html_url = "https://www.the-numbers.com/market/2021/top-grossing-movies"
r = requests.get(html_url)
top_movie = pd.read_html(r.text, header=0)
movie = top_movie[0]
movie.set_index('Rank', inplace=True)
movie.head()

| Rank | Movie | ReleaseDate | Distributor | Genre | 2021 Gross | Tickets Sold |
|---|---|---|---|---|---|---|
| 1 | A Quiet Place: Part II | May 28, 2021 | Paramount Pictures | Horror | $137,363,444 | 14996009.0 |
| 2 | Godzilla vs. Kong | Mar 31, 2021 | Warner Bros. | Action | $100,392,257 | 10959853.0 |
| 3 | F9: The Fast Saga | Jun 25, 2021 | Universal | Action | $76,637,530 | 8366542.0 |
| 4 | Cruella | May 28, 2021 | Walt Disney | Comedy | $72,033,000 | 7863864.0 |
| 5 | The Conjuring: The Devil Ma… | Jun 4, 2021 | Warner Bros. | Horror | $59,204,511 | 6463374.0 |
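
The 2021 Gross values above are text such as `$137,363,444`, which is why taking their average directly fails. One possible way to get a numeric average is sketched below; it is not part of the scripts in this directory, and the three-row Series is a hand-copied stand-in for the real column.

```python
import pandas as pd

# Stand-in for the "2021 Gross" column: the first three values from the table.
gross = pd.Series(["$137,363,444", "$100,392,257", "$76,637,530"], name="2021 Gross")

# Strip the dollar sign and the thousands separators, then convert to numbers.
gross_numeric = pd.to_numeric(gross.str.replace(r"[$,]", "", regex=True))

# The mean is now well defined.
average_gross = gross_numeric.mean()
print(average_gross)
```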


## Reading HTML tables using Pandas.ipynb

I also analysed the above data frame in the file `Reading HTML tables using Pandas.ipynb`.

I watched the tutorial on YouTube
<a href="https://www.youtube.com/watch?v=r-uOLxNrNk8&t=13929s">"Data Analysis with Python - Full Course for Beginners (Numpy, Pandas, Matplotlib, Seaborn)"</a>
with <a href="https://github.com/ine-rmotr-curriculum">some materials on github</a>.

I also watched the tutorial on YouTube
<a href="https://www.youtube.com/watch?v=0P7QnIQDBJY">"Python Plotting Tutorial w/ Matplotlib & Pandas (Line Graph, Histogram, Pie Chart, Box & Whiskers)"</a>

## (1) The list of popular horror movies

To practice logging, we print the list of popular horror movies.
Here, the following four files are involved; each file after `logger.py` imports from the ones listed before it.
<ul>
<li>logger.py</li>
<li>fetch_for_practice_logging.py</li>
<li>refinement_for_practice_logging.py</li>
<li>practice_logging1.py</li>
    </ul>
    
To study inheritance, I watched the tutorial 
<a href="https://www.youtube.com/watch?v=z6MCR2O0yak">"Inheritance"</a>
<br> To study logging, I watched the tutorials 
<ul>
<li> 
<a href="https://calmcode.io/logging/introduction.html">"Logging: better hindsight"</a>
</li>
<li>
<a href="https://www.youtube.com/watch?v=p0A4CV4MWd0&list=PLqnslRFeH2UqLwzS0AwKDKLrpYBKzLBy2&index=10&t=11s">"Logging in Python"</a>
</li></ul>

In [None]:
# logger.py

import logging

logger = logging.getLogger(__name__)

# handlers to determine the destination of log messages
stream_handler = logging.StreamHandler() # console (stderr by default)
file_handler = logging.FileHandler("debug.log") # file

# set levels
logger.setLevel(logging.DEBUG) # all levels
stream_handler.setLevel(logging.WARNING) # only WARNING and above on the console
file_handler.setLevel(logging.DEBUG) # details in the file

# formatters to determine the layouts of log messages
fmt_stream = '%(levelname)s %(asctime)s %(message)s'
fmt_file = '%(levelname)s %(asctime)s [%(filename)s:%(funcName)s:%(lineno)d] %(message)s'

stream_formatter = logging.Formatter(fmt_stream)
file_formatter = logging.Formatter(fmt_file)

# here we hook everything together
stream_handler.setFormatter(stream_formatter)
file_handler.setFormatter(file_formatter)

# add handlers to the logger
logger.addHandler(stream_handler)
logger.addHandler(file_handler)

In [None]:
# fetch_for_practice_logging.py

import pandas as pd
import numpy as np
import requests
from logger import logger

def popular_movie():
    html_url = "https://www.the-numbers.com/market/2021/top-grossing-movies"
    r = requests.get(html_url)
    logger.debug(f"get {html_url}")
    top_movie = pd.read_html(r.text, header=0)
    movie = top_movie[0]
    logger.debug(f"get the table movie from {html_url}")
    movie.set_index('Rank', inplace=True)
    logger.debug("clean the data frame")
    return movie.drop(['Total Gross of All Movies', 'Total Tickets Sold'], axis = 0)

In [None]:
# refinement_for_practice_logging.py

import pandas as pd
from fetch_for_practice_logging import popular_movie
from logger import logger

def refinement(genre):
    logger.debug("about to download data.")
    movie = popular_movie()
    logger.debug("data is downloaded.")
    # keep only the rows of the requested genre (avoid reusing the name `genre`)
    selected = movie[movie['Genre'] == genre]
    logger.warning("will show the data of the requested genre.")
    return pd.DataFrame(selected['Movie'])

In [None]:
# practice_logging1.py

import sys
from logger import logger

from refinement_for_practice_logging import refinement

if __name__ == "__main__":
    try:
        genre = sys.argv[1]  # genre given on the command line, e.g. Horror
        logger.warning("Everything is going well.")
        print(f"Popular {genre} movie in 2021 with the total ranks are {refinement(genre)}.")

    except BaseException:
        # exc_info=True adds the traceback to the log record
        logger.error("Error happened!", exc_info=True)

On the command line, I run the following command.

`$ python practice_logging1.py Horror`

The console shows only messages of level WARNING and above (warning, error and critical), while the file `debug.log` records every level (debug, info, warning, error and critical).
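
The split between console and file comes from the handler levels set in `logger.py`. The sketch below reproduces that setup with two in-memory streams so the result can be checked without touching the console or `debug.log`; the names `console_stream`, `file_stream` and `readme_demo_handlers` are assumptions made for this example.

```python
import io
import logging

console_stream = io.StringIO()  # stands in for the console
file_stream = io.StringIO()     # stands in for debug.log

demo_logger = logging.getLogger("readme_demo_handlers")
demo_logger.setLevel(logging.DEBUG)  # the logger itself lets every level through
demo_logger.propagate = False

console_handler = logging.StreamHandler(console_stream)
console_handler.setLevel(logging.WARNING)   # WARNING and above only
file_like_handler = logging.StreamHandler(file_stream)
file_like_handler.setLevel(logging.DEBUG)   # every level

formatter = logging.Formatter("%(levelname)s %(message)s")
for handler in (console_handler, file_like_handler):
    handler.setFormatter(formatter)
    demo_logger.addHandler(handler)

demo_logger.debug("about to download data.")
demo_logger.info("data is downloaded.")
demo_logger.warning("will show the data of the requested genre.")
demo_logger.error("Error happened!")

console_lines = console_stream.getvalue().splitlines()
file_lines = file_stream.getvalue().splitlines()
print(console_lines)
print(file_lines)
```

Each handler filters independently, so the same record can be dropped by one handler and written by the other.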

<img src="practice_logging(genre).png">

The file `debug.log` has the following log messages.

WARNING 2021-06-29 13:18:57,167 [practice_logging1.py:<module>:8] Everything is going well.
    
DEBUG 2021-06-29 13:18:57,167 [refinement_for_practice_logging.py:refinement:6] is about to download data.
    
WARNING 2021-06-29 13:18:59,336 [fetch_for_practice_logging.py:popular_movie:9] get https://www.the-numbers.com/market/2021/top-grossing-movies  
    
WARNING 2021-06-29 13:18:59,376 [fetch_for_practice_logging.py:popular_movie:12] get the table movie from https://www.the-numbers.com/market/2021/top-grossing-movies
    
WARNING 2021-06-29 13:18:59,377 [fetch_for_practice_logging.py:popular_movie:14] clean the data frame
    
DEBUG 2021-06-29 13:18:59,378 [refinement_for_practice_logging.py:refinement:8] data is downloaded.
    
WARNING 2021-06-29 13:18:59,378 [refinement_for_practice_logging.py:refinement:10] will show the data of the requested genre.

### Logging on Jupyter Notebook
I run a similar program in Jupyter Notebook instead of on the console. It works well.
See the file `practice_logging(genre).ipynb`.
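
One thing that can differ in a notebook is that a cell may be run several times. Since `logging.getLogger` returns the same logger object for the same name, calling `addHandler` again on each run would duplicate every message. A possible guard is sketched below; the helper `get_notebook_logger` and the logger name are hypothetical and do not appear in the notebook.

```python
import io
import logging

def get_notebook_logger(name, stream):
    """Hypothetical helper: configure the logger only on the first call."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    if not logger.handlers:  # skip if a handler was attached by an earlier run
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger

stream = io.StringIO()
first = get_notebook_logger("readme_demo_notebook", stream)
second = get_notebook_logger("readme_demo_notebook", stream)  # simulates re-running the cell

second.warning("logged once")
handler_count = len(second.handlers)
lines = stream.getvalue().splitlines()
print(handler_count, lines)
```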

## (2) The average of tickets sold

To practice logging, we print the average of tickets sold.
Here, the following four files are involved; each file after `logger.py` imports from the ones listed before it. The first two files are the same as above.
<ul>
<li>logger.py</li>
<li>fetch_for_practice_logging.py</li>
<li>mean_for_practice_logging.py</li>
<li>practice_logging2.py</li>
    </ul>

In [None]:
# mean_for_practice_logging.py

import pandas
from fetch_for_practice_logging import popular_movie
from logger import logger

def get_mean(ticker):
    logger.debug("about to download data.")
    movie = popular_movie()  
    logger.debug("data is downloaded.")
    return movie[ticker].mean()

In [None]:
# practice_logging2.py

import sys
from logger import logger

from mean_for_practice_logging import get_mean

if __name__ == "__main__":
    try:
        ticker = sys.argv[1]  # column name given on the command line, e.g. "Tickets Sold"
        logger.warning("ready to show the average.")
        print(f"The average {ticker} for movie in 2021 is {get_mean(ticker)}.")
    
    except BaseException:
        logger.error("Error happened!", exc_info=True)

On the command line, I run the following command.

`$ python practice_logging2.py "Tickets Sold"`

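To cause errors, I run the following commands. The first asks for a column that does not exist in the table, and the second asks for the average of the object column 2021 Gross.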

<code>$ python practice_logging2.py "Gross"</code>

<code>$ python practice_logging2.py "2021 Gross"</code>

All the logging messages are recorded in the file `debug.log`.
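
To give a rough idea of what `exc_info=True` adds to such a record, the sketch below triggers a `KeyError` and logs it in the same way as `practice_logging2.py`. The dictionary `columns` is a stand-in for the data frame, and the logger name and in-memory stream are assumptions made for this example.

```python
import io
import logging

log_stream = io.StringIO()  # stands in for debug.log
demo_logger = logging.getLogger("readme_demo_traceback")
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False
demo_logger.addHandler(logging.StreamHandler(log_stream))

# Stand-in for the data frame: one column, two values taken from the table.
columns = {"Tickets Sold": [14996009.0, 10959853.0]}

try:
    values = columns["Gross"]  # no such column, so a KeyError is raised
except KeyError:
    # exc_info=True appends the full traceback to the log message
    demo_logger.error("Error happened!", exc_info=True)

log_text = log_stream.getvalue()
print(log_text)
```

The logged text then contains the message followed by the traceback, which makes it possible to see afterwards which lookup failed.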