# Higher Diploma in Science in Computing (Data Analytics)  

**University:** [Atlantic Technological University (ATU)](https://www.atu.ie/)  
**Module:** Computer Infrastructure  
**Lecturer:** [Ian McLoughlin](https://github.com/ianmcloughlin)  
**Author:** [Elaine R. Cazetta](https://github.com/elainecazetta)  

---

# Project: FAANG Stock Data Analysis with yfinance  
This notebook demonstrates how to download, store, and visualize FAANG stock data using Python and the yfinance package. It also shows how to structure a reusable function and automate plotting for further analysis.

---

## 🔹Problem 1 – Data from yfinance

---

### - Requirements:

Using the [yfinance](https://github.com/ranaroussi/yfinance) Python package, write a function called `get_data()` that downloads all hourly data for the previous five days for the five FAANG stocks:

- Facebook (META)
- Apple (AAPL)
- Amazon (AMZN)
- Netflix (NFLX)
- Google (GOOG)

The function should save the data into a folder called `data` in the root of your repository using a filename with the format `YYYYMMDD-HHmmss.csv` where `YYYYMMDD` is the four-digit year (e.g. `2025`), followed by the two-digit month (e.g. `09` for September), followed by the two-digit day, and `HHmmss` is hours, minutes, and seconds.
Create the `data` folder if you don't already have one.

---

### - Overview of the Solution:  

The following steps show how to import the necessary libraries, download hourly FAANG stock data for the past five days using the yfinance package, and save it to a timestamped CSV file inside the `data` folder.

In [None]:
# Import libraries

# Data Frames
import pandas as pd

# Yahoo Finance data
import yfinance as yf

# Dates and Times
import datetime as dt

In [None]:
# List of FAANG tickers
tickers = ['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG']

In [None]:
# Download hourly data for the previous five days and assign it to a DataFrame
df = yf.download(tickers, period='5d', interval='1h')
df.head(3)  # show the first 3 rows

### - The `data` Directory:  
This step uses Python’s built-in `os` module to create a folder named `data`, which will store the CSV files downloaded from Yahoo Finance. Passing `exist_ok=True` to `os.makedirs()` creates the folder if it is missing and does nothing if it already exists, so the cell can be run repeatedly without raising an error.
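
As a small illustration of what `exist_ok=True` changes, the sketch below repeats the call inside a temporary directory so that it does not touch the repository. The temporary location and the names `target`, `created`, and `raised` are only for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "data")

    os.makedirs(target, exist_ok=True)  # creates the folder
    os.makedirs(target, exist_ok=True)  # second call does nothing

    created = os.path.isdir(target)

    try:
        os.makedirs(target)             # default is exist_ok=False
        raised = False
    except FileExistsError:
        raised = True

print(created, raised)  # True True
```

Without the argument, the second call on an existing folder raises `FileExistsError`, which is why the notebook passes `exist_ok=True`.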

In [None]:
# Import the os module and create 'data' folder if it doesn't exist
# Reference: https://docs.python.org/3/library/os.html
# Reference: OpenAI
import os

os.makedirs("data", exist_ok=True)

In [None]:
# Save dataframe to CSV
# Reference: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
df.to_csv('data/data.csv')

### - Dates and Times:   
To create unique filenames, a timestamp is generated using Python’s `datetime` module [(reference: official documentation)](https://docs.python.org/3/library/datetime.html). Because the timestamp has one-second resolution, each run gets a distinct, descriptive filename as long as two runs do not start within the same second.
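
To make the format concrete, here is a minimal sketch that builds a filename from a fixed moment instead of the current time, so the zero-padding is visible. The helper `timestamped_name()` and its parameters are hypothetical and not part of the assignment.

```python
import datetime as dt
import os

def timestamped_name(when, folder="data", ext="csv"):
    """Build 'folder/YYYYMMDD-HHmmss.ext' from a datetime (illustrative helper)."""
    return f"{folder}/{when.strftime('%Y%m%d-%H%M%S')}.{ext}"

# A fixed moment: 6 September 2025, 08:05:09
example = dt.datetime(2025, 9, 6, 8, 5, 9)
name = timestamped_name(example)
print(name)  # data/20250906-080509.csv

# The same format string parses the stamp back into a datetime
stamp = os.path.splitext(os.path.basename(name))[0]
parsed = dt.datetime.strptime(stamp, "%Y%m%d-%H%M%S")
print(parsed == example)  # True
```

Since every field is zero-padded and ordered from year down to second, names in this format also sort chronologically when sorted as plain text.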

In [None]:
# Get the current date and time using the datetime module
now = dt.datetime.now()

# Display the current date and time
now

In [None]:
# Format the current date and time as a string: YYYYMMDD-HHmmss
# This format will be used in the filename
now.strftime("%Y%m%d-%H%M%S")

In [None]:
# Create a unique filename that includes the timestamp
# The file will be saved inside the 'data' folder as a CSV file
filename = "data/" + dt.datetime.now().strftime("%Y%m%d-%H%M%S") + ".csv"
print(filename)

In [None]:
# Save the downloaded dataframe (df) to the CSV file
# This will store the FAANG stock data in the 'data' folder
# Reference: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
df.to_csv(filename)

### - Complete Function: `get_data()`    

Below is the final version of the `get_data()` function that combines all the steps demonstrated earlier. This function downloads hourly stock data for the five FAANG companies for the previous five days, creates a timestamped filename using the `datetime` module, saves the dataset as a CSV file inside the `data` folder, and ensures that the folder exists before saving.

In [None]:
# Script of the `get_data()` function
#
# References:
# https://www.w3schools.com/python/python_functions.asp
# https://docs.python.org/3/tutorial/controlflow.html#defining-functions
# OpenAI

def get_data():
    """
    Downloads hourly stock data for the previous five days 
    for the FAANG companies and saves it as a timestamped CSV file.
    """
    
    # Import libraries inside the function
    import yfinance as yf
    import datetime as dt
    import os
  
    # Download hourly data for the last 5 days of FAANG tickers
    data = yf.download(['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG'], period='5d', interval='1h')

    # Create 'data' folder if it doesn't exist
    # Ref: https://docs.python.org/3/library/os.html
    # Ref: OpenAI
    os.makedirs("data", exist_ok=True)

    # Generate filename with timestamp
    filename = "data/" + dt.datetime.now().strftime("%Y%m%d-%H%M%S") + ".csv"

    # Save data to CSV
    data.to_csv(filename)

    # Print confirmation message
    print(f"Data saved to {filename}")

    # Return the downloaded DataFrame
    return data

# Example usage
df = get_data()
df.head(3)

---

## 🔹Problem 2: Plotting Data  

---

### - Requirements:  

Write a function called `plot_data()` that opens the latest data file in the `data` folder and, on one plot, plots the `Close` prices for each of the five stocks.
The plot should include axis labels, a legend, and the date as a title.
The function should save the plot into a `plots` folder in the root of your repository using a filename in the format `YYYYMMDD-HHmmss.png`.
Create the `plots` folder if you don't already have one.

---

### - Solution:  

In [None]:
# Import libraries

# Data Frames
import pandas as pd

# Plotting
import matplotlib.pyplot as plt

# Dates and Times
import datetime as dt

# Handle folders
import os

# Find matching files
import glob

### - The `plots` Directory:  
To store the generated plots, Python’s built-in `os` module is used to create a folder named `plots`. Calling `os.makedirs()` with `exist_ok=True` creates the folder if it does not already exist and does nothing otherwise, so the code can be run multiple times without errors. This prepares a dedicated location for saving all plot images.


In [None]:
# Create 'plots' folder if it doesn't exist
# Reference: https://docs.python.org/3/library/os.html
# Reference: OpenAI
os.makedirs("plots", exist_ok=True)

### - Finding the Latest CSV File  

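To work with the most recent dataset, we search the `data` folder for all CSV files using Python’s `glob` module. Each file’s timestamp is read with `os.path.getctime` (the creation time on Windows, the time of the last metadata change on Unix-like systems), and the file with the newest timestamp is selected. If no CSV files exist, a message is printed instead of raising an error; inside `plot_data()` the function returns early in that case.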
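
An alternative that does not depend on filesystem timestamps is to pick the file by name, since `YYYYMMDD-HHmmss` names sort chronologically. The sketch below is one way this might look; the helper `latest_by_name()` and the constant `PATTERN` are hypothetical, and the files are created in a temporary directory purely for the demonstration.

```python
import glob
import os
import tempfile

# Matches only names of the form YYYYMMDD-HHmmss.csv
PATTERN = "[0-9]" * 8 + "-" + "[0-9]" * 6 + ".csv"

def latest_by_name(folder):
    """Return the timestamp-named CSV that sorts last, or None if there is none."""
    files = glob.glob(os.path.join(glob.escape(folder), PATTERN))
    return max(files, key=os.path.basename, default=None)

with tempfile.TemporaryDirectory() as tmp:
    empty_result = latest_by_name(tmp)

    # Create the newest-named file first, so creation order and name order disagree
    for name in ["20250906-090000.csv", "20250830-090000.csv", "data.csv"]:
        open(os.path.join(tmp, name), "w").close()

    picked = os.path.basename(latest_by_name(tmp))

print(empty_result, picked)  # None 20250906-090000.csv
```

The pattern also skips files such as `data.csv`, which do not follow the timestamp format and would otherwise sort after every timestamped name.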

In [None]:
# Find the most recent CSV file in the 'data' folder
# References: 
# https://docs.python.org/3/library/glob.html
# OpenAI

# The 'glob' module searches for all CSV files in the folder
list_of_files = glob.glob("data/*.csv")

In [None]:
# 'os.path.getctime' returns the creation time on Windows and the last metadata change time on Unix
latest_file = max(list_of_files, key=os.path.getctime, default=None)  # most recent file, or None if the list is empty

In [None]:
# Report whether a file was found
# ('return' is only valid inside a function, so this cell just prints a message)
if latest_file is None:
    print("No CSV files found in the 'data' folder.")
else:
    print(f"Using latest file: {latest_file}")

In [None]:
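# Read the most recent CSV file into a DataFrame
# header=[0, 1] handles multi-level columns; index_col=0 sets the first column as the index
# parse_dates=True converts the index to datetimes so the x-axis is a real time axis
df = pd.read_csv(latest_file, header=[0, 1], index_col=0, parse_dates=True)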

### - Plotting FAANG Stock Data

In this section, we visualize the `Close` prices for the five FAANG stocks from the most recent CSV file. The plot includes axis labels, a legend, and the current date as the title. The figure is then saved as a timestamped PNG file in the `plots` folder.
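
The order of the column levels in the file decides whether a column is addressed as `('Close', 'AAPL')` or `('AAPL', 'Close')`. The sketch below checks this on synthetic data. It assumes the layout that a default `yf.download()` call produces for several tickers, with the price field on the top level and the ticker on the second. The values and the names `frame`, `loaded`, and `close` are invented for illustration.

```python
import io
import pandas as pd

# Synthetic stand-in for a download: two price fields, two tickers
index = pd.DatetimeIndex(
    ["2025-09-01 13:30", "2025-09-01 14:30", "2025-09-01 15:30"], name="Datetime"
)
columns = pd.MultiIndex.from_product(
    [["Close", "Open"], ["AAPL", "META"]], names=["Price", "Ticker"]
)
frame = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5], [2.0, 3.0, 4.0, 5.0]],
    index=index,
    columns=columns,
)

# Round-trip through CSV text, reading it the same way the notebook does
buffer = io.StringIO()
frame.to_csv(buffer)
buffer.seek(0)
loaded = pd.read_csv(buffer, header=[0, 1], index_col=0, parse_dates=True)

# Selecting the top-level field leaves one column per ticker
close = loaded["Close"]
print(list(close.columns))

# Membership tests show which tuple order the file uses
print(("Close", "AAPL") in loaded.columns, ("AAPL", "Close") in loaded.columns)
```

Running the same membership test on a real file is a quick way to confirm the level order before plotting. A file downloaded with `group_by='ticker'` would have the levels the other way round.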

In [None]:
# Initialize a new figure for plotting
# Set the figure size for better readability
plt.figure(figsize=(12, 6))

In [None]:
# Loop through each FAANG ticker and plot its Close price
# Handle MultiIndex columns; yfinance's default layout is (field, ticker), e.g. ('Close', 'AAPL')
for ticker in ['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG']:
    if ('Close', ticker) in df.columns:
        plt.plot(df.index, df[('Close', ticker)], label=ticker)

In [None]:
# Add axis labels
plt.xlabel("Date and Time")
plt.ylabel("Close Price (USD)")

# Add a title with the current date
plt.title(f"FAANG Stock Close Prices - {dt.datetime.now().strftime('%Y-%m-%d')}")

# Add a legend to distinguish each stock
plt.legend()

In [None]:
# Generate a timestamped filename for the plot
filename = "plots/" + dt.datetime.now().strftime("%Y%m%d-%H%M%S") + ".png"

# Save the figure as a PNG file in the 'plots' folder
plt.savefig(filename)

### - Complete Function: `plot_data()`    

Below is the final version of the `plot_data()` function, which combines all the steps demonstrated earlier. It opens the most recent CSV file from the `data` folder and plots the `Close` prices for the five FAANG stocks on a single chart with axis labels, a legend, and the current date as the title. The plot is then saved as a timestamped PNG file in the `plots` folder, which is created automatically if it doesn’t already exist.

In [None]:
# Script of the `plot_data()` function
#
# References:
# https://matplotlib.org/stable/gallery/index.html
# https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
# https://docs.python.org/3/library/os.html
# https://docs.python.org/3/library/glob.html
# OpenAI

def plot_data():
    """
    Opens the latest CSV file in the 'data' folder and plots the Close prices
    for the FAANG companies on one chart. The plot includes axis labels,
    a legend, and the current date as a title. It saves the plot as a PNG file
    in the 'plots' folder with a timestamped filename.
    """

    # Import libraries inside the function
    import pandas as pd
    import matplotlib.pyplot as plt
    import datetime as dt
    import os
    import glob

    # Create 'plots' folder if it doesn't exist
    # Ref: https://docs.python.org/3/library/os.html
    # Ref: OpenAI
    os.makedirs("plots", exist_ok=True)

    # Find the latest CSV file in the 'data' folder
    list_of_files = glob.glob("data/*.csv")
    latest_file = max(list_of_files, key=os.path.getctime, default=None)

    # Handle case where no files are found
    if latest_file is None:
        print("No CSV files found in the 'data' folder.")
        return  # Exit the function if no files exist
    else:
        print(f"Using latest file: {latest_file}")

    # Read the latest data file; parse_dates=True turns the index into datetimes
    df = pd.read_csv(latest_file, header=[0, 1], index_col=0, parse_dates=True)

    # Plot Close prices for each FAANG stock
    plt.figure(figsize=(12, 6))

    # Handle MultiIndex columns; yfinance's default layout is (field, ticker), e.g. ('Close', 'AAPL')
    for ticker in ['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG']:
        if ('Close', ticker) in df.columns:
            plt.plot(df.index, df[('Close', ticker)], label=ticker)

    # Add labels, title, and legend
    plt.xlabel("Date and Time")
    plt.ylabel("Close Price (USD)")
    plt.title(f"FAANG Stock Close Prices - {dt.datetime.now().strftime('%Y-%m-%d')}")
    plt.legend()

    # Generate filename with timestamp
    filename = "plots/" + dt.datetime.now().strftime("%Y%m%d-%H%M%S") + ".png"

    # Save the plot
    plt.savefig(filename)
    plt.close()
    
    # Print confirmation message
    print(f"Plot saved to {filename}")


---

## 🔹Problem 3: Script  

---

### - Requirements:  

Create a Python script called `faang.py` in the root of your repository.
Copy the above functions into it and edit it so that whenever someone at the terminal types `./faang.py`, the script runs, downloading the data and creating the plot.
Note that this will require a shebang line and the script to be marked executable.
Explain the steps you took in your notebook.
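
As a rough illustration of the two requirements named above, the sketch below writes a placeholder script with a shebang line to a temporary directory and then sets its owner-execute bit, the equivalent of `chmod u+x faang.py` on a Unix-like system. The script body is a stand-in, not the real `faang.py`, and the names `SCRIPT`, `first_line`, and `executable` are only for this example.

```python
import os
import stat
import tempfile

# Placeholder script: the first line is the shebang that tells the shell which interpreter to use
SCRIPT = """#!/usr/bin/env python3
print("downloading the data and creating the plot would happen here")
"""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "faang.py")
    with open(path, "w") as f:
        f.write(SCRIPT)

    # Add the owner-execute permission on top of the existing mode bits
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR)

    first_line = SCRIPT.splitlines()[0]
    executable = bool(os.stat(path).st_mode & stat.S_IXUSR)

print(first_line, executable)
```

In the repository itself the same result would normally be reached from the terminal rather than from Python. The sketch only shows what the shebang line and the executable bit are.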

---

TBD

---

## 🔹Problem 4: Automation  

---

### - Requirements:  

Create a [GitHub Actions workflow](https://docs.github.com/en/actions) to run your script every Saturday morning.
The script should be called `faang.yml` in a `.github/workflows/` folder in the root of your repository.
In your notebook, explain each of the individual lines in your workflow.

---

TBD

---

## End