# Higher Diploma in Science in Computing (Data Analytics)  

**Module:** Computer Infrastructure  
**Lecturer:** Ian McLoughlin  
**Author:** Elaine R. Cazetta  

---

# Project: FAANG Stock Data Analysis with yfinance  
This notebook demonstrates how to download, store, and visualize FAANG stock data using Python and the yfinance package. It also shows how to structure a reusable function and automate plotting for further analysis.

---

## 🔹Problem 1 – Data from yfinance
---

### - Requirements:

Using the [yfinance](https://github.com/ranaroussi/yfinance) Python package, write a function called `get_data()` that downloads all hourly data for the previous five days for the five FAANG stocks:

- Facebook (META)
- Apple (AAPL)
- Amazon (AMZN)
- Netflix (NFLX)
- Google (GOOG)

The function should save the data into a folder called `data` in the root of your repository using a filename with the format `YYYYMMDD-HHmmss.csv`, where `YYYYMMDD` is the four-digit year (e.g. `2025`), followed by the two-digit month (e.g. `09` for September), followed by the two-digit day, and `HHmmss` is the hour, minutes, and seconds.
Create the `data` folder if you don't already have one.

### - Overview of the Solution:  

The following steps show how to import the necessary libraries, download hourly FAANG stock data for the past five days using the yfinance package, and save it to a timestamped CSV file inside the `data` folder.

### - Implementation:

In [1]:
# Import libraries

# Data Frames
import pandas as pd

# Yahoo Finance data
import yfinance as yf

# Dates and Times
import datetime as dt

In [2]:
# Tickers object for the five FAANG symbols (not needed by yf.download() below)
tickers = yf.Tickers('META AAPL AMZN NFLX GOOG')

In [3]:
# Download FAANG stocks data and assign it to a dataframe:
df = yf.download(['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG'], period='5d', interval='1h')
df.head(10) # show the first 10 rows

[*********************100%***********************]  5 of 5 completed


Price,Close,Close,Close,Close,Close,High,High,High,High,High,...,Open,Open,Open,Open,Open,Volume,Volume,Volume,Volume,Volume
Ticker,AAPL,AMZN,GOOG,META,NFLX,AAPL,AMZN,GOOG,META,NFLX,...,AAPL,AMZN,GOOG,META,NFLX,AAPL,AMZN,GOOG,META,NFLX
Datetime,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2
2025-11-03 14:30:00+00:00,267.630005,255.339996,283.109985,648.506592,1099.681763,270.779999,258.600006,283.350006,659.329895,1133.5,...,270.420013,255.639999,282.470001,656.0,1133.099976,10223517,26213930,4971493,6069693,989317
2025-11-03 15:30:00+00:00,266.640015,255.460007,282.940002,649.875,1080.439941,268.279999,256.859985,283.880005,652.580017,1103.480957,...,267.640015,255.339996,283.079987,648.505005,1099.532471,3102546,8062082,1577709,2263349,785387
2025-11-03 16:30:00+00:00,267.535004,256.010315,282.839996,650.169983,1089.849976,268.149994,256.440002,284.100006,653.0,1089.98999,...,266.659912,255.455002,282.970001,649.849976,1079.959961,2821699,5726412,1006362,1443661,595383
2025-11-03 17:30:00+00:00,267.678009,256.51001,282.709991,644.140015,1093.52002,268.0,257.170013,283.679993,650.929993,1094.670044,...,267.519989,256.026215,282.75,650.169983,1089.704956,2034109,4362482,897827,2972799,310840
2025-11-03 18:30:00+00:00,267.309998,255.065002,283.299988,644.460022,1095.800049,268.619995,256.584991,283.875,646.0,1096.839966,...,267.7099,256.51001,282.704987,644.05011,1093.809937,2075268,3903743,1014491,2232166,261430
2025-11-03 19:30:00+00:00,267.619995,254.869995,285.109985,642.369995,1099.25,267.790009,256.029999,285.940002,646.280029,1099.699951,...,267.320007,255.065002,283.299988,644.460022,1095.52002,2243434,4161677,1353842,1730255,238988
2025-11-03 20:30:00+00:00,269.059998,254.059998,284.25,637.719971,1099.569946,269.109985,255.279999,285.167999,642.76001,1100.300049,...,267.630005,254.869995,285.140015,642.369995,1099.23999,3383138,4190135,1168232,2642789,281103
2025-11-04 14:30:00+00:00,268.429993,253.389999,280.269989,635.355774,1098.313965,269.589996,255.440002,281.075012,641.739929,1104.599854,...,268.242493,250.380005,277.070007,628.039978,1099.285034,9645778,10799520,3778209,5424088,626099
2025-11-04 15:30:00+00:00,270.609985,252.440094,279.376099,631.97998,1092.555054,271.0,253.565002,281.833588,638.799988,1103.5,...,268.429993,253.369995,280.22049,635.27002,1098.209961,3278354,4292114,1285824,2766300,300006
2025-11-04 16:30:00+00:00,270.01001,250.955002,278.25,631.544983,1095.069946,271.485992,252.535095,279.549988,633.450012,1096.719971,...,270.640015,252.429993,279.230011,631.992004,1092.359985,3846826,2839567,1038904,1369778,226854
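
The output above has two column levels, `Price` and `Ticker`. As a minimal sketch of how one level can be selected, the block below builds a small frame with the same layout. The numbers and the two-ticker subset are illustrative only, and the names `sample`, `close`, and `aapl` are not part of the notebook.

```python
import pandas as pd

# Illustrative frame with the same two-level column layout as the yfinance output.
columns = pd.MultiIndex.from_product(
    [["Close", "Open"], ["AAPL", "META"]], names=["Price", "Ticker"]
)
index = pd.to_datetime(
    ["2025-11-03 14:30", "2025-11-03 15:30"], utc=True
).rename("Datetime")
sample = pd.DataFrame(
    [[267.6, 648.5, 270.4, 656.0], [266.6, 649.9, 267.6, 648.5]],
    index=index,
    columns=columns,
)

# Selecting on the top level leaves one column per ticker.
close = sample["Close"]

# Selecting on the second level leaves one column per price field.
aapl = sample.xs("AAPL", axis=1, level="Ticker")
```

With the real download, `df["Close"]` should give the five closing-price columns in the same way.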


In [None]:
# Save dataframe to CSV
# Reference: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
# Note: to_csv() does not create missing folders, so 'data' must already exist
df.to_csv('data/data.csv')

### - The `data` Directory:  
This step uses Python's built-in `os` module to create a folder named `data`. This directory will store the CSV files downloaded from Yahoo Finance. Calling `os.makedirs()` with `exist_ok=True` creates the folder if it doesn't already exist and raises no error if the cell is run more than once.

In [None]:
# Create 'data' folder if it doesn't exist
# Ref: https://docs.python.org/3/library/os.html
# Ref: OpenAI
import os

os.makedirs("data", exist_ok=True)

### - Dates and Times:   
In this section, I'll generate a timestamp for naming the CSV file using Python's `datetime` module [(reference: official documentation)](https://docs.python.org/3/library/datetime.html). This ensures that each dataset is saved with a unique and descriptive filename.

In [5]:
# Get the current date and time using the datetime module
now = dt.datetime.now()

# Display the current date and time
now

datetime.datetime(2025, 11, 9, 16, 52, 16, 380946)

In [6]:
# Format the current date and time as a string: YYYYMMDD-HHmmss
# This format will be used in the filename
now.strftime("%Y%m%d-%H%M%S")

'20251109-165216'

In [7]:
# Create a unique filename that includes the timestamp
# The file will be saved inside the 'data' folder as a CSV file
filename = "data/" + dt.datetime.now().strftime("%Y%m%d-%H%M%S") + ".csv"
print(filename)

data/20251109-165216.csv
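
Since the same naming scheme is needed later for the plots, the folder creation and timestamp steps could be combined into one small helper. This is only a sketch: `timestamped_path` and its `now` parameter are hypothetical names, not something the notebook or yfinance defines.

```python
import datetime as dt
from pathlib import Path


def timestamped_path(folder, suffix, now=None):
    """Return folder/YYYYMMDD-HHmmss<suffix>, creating the folder if needed."""
    # Accepting `now` as a parameter keeps the function easy to test.
    now = now or dt.datetime.now()
    target = Path(folder)
    target.mkdir(parents=True, exist_ok=True)
    return target / (now.strftime("%Y%m%d-%H%M%S") + suffix)
```

Calling `timestamped_path("data", ".csv")` would then return a path of the same form as the filename printed above, and `timestamped_path("plots", ".png")` would cover the plotting step.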


In [8]:
# Save the downloaded dataframe (df) to the CSV file
# This will store the FAANG stock data in the 'data' folder
# For reference: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
df.to_csv(filename)
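
Because the columns have two levels, the saved CSV starts with two header rows followed by a row holding the index name. The sketch below, which uses made-up values and a temporary folder rather than the real `data` folder, shows one way such a file can be read back so that the column levels are restored.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up frame with the same (Price, Ticker) column layout as the download.
columns = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["AAPL", "GOOG"]], names=["Price", "Ticker"]
)
index = pd.to_datetime(
    ["2025-11-03 14:30", "2025-11-03 15:30", "2025-11-03 16:30"], utc=True
).rename("Datetime")
sample = pd.DataFrame(
    [[267.6, 283.1, 100, 200], [266.6, 282.9, 110, 210], [267.5, 282.8, 120, 220]],
    index=index,
    columns=columns,
)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.csv"
    sample.to_csv(path)
    # Two header rows rebuild the (Price, Ticker) column levels;
    # the first column becomes the datetime index again.
    loaded = pd.read_csv(path, header=[0, 1], index_col=0, parse_dates=True)
```

After loading, a single series can be addressed with a tuple such as `loaded[("Close", "AAPL")]`.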

### - Complete Function: `get_data()`    

Below is the final version of the `get_data()` function that combines all the steps demonstrated earlier. This function downloads hourly stock data for the five FAANG companies for the previous five days, creates a timestamped filename using the `datetime` module, saves the dataset as a CSV file inside the `data` folder, and ensures that the folder exists before saving.

In [None]:
# Script of the `get_data()` function
#
# References:
# https://www.w3schools.com/python/python_functions.asp
# https://docs.python.org/3/tutorial/controlflow.html#defining-functions
# OpenAI

def get_data():
    """
    Downloads hourly stock data for the previous five days 
    for the FAANG companies and saves it as a timestamped CSV file.
    """
    
    # Import libraries inside the function
    import yfinance as yf
    import datetime as dt
    import os
  
    # Download hourly data for the last 5 days of FAANG tickers
    data = yf.download(['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG'], period='5d', interval='1h')

    # Create 'data' folder if it doesn't exist
    # Ref: https://docs.python.org/3/library/os.html
    # Ref: OpenAI
    os.makedirs("data", exist_ok=True)

    # Generate filename with timestamp
    filename = "data/" + dt.datetime.now().strftime("%Y%m%d-%H%M%S") + ".csv"

    # Save data to CSV
    data.to_csv(filename)

    # Print confirmation message
    print(f"Data saved to {filename}")

    # Return the downloaded DataFrame
    return data

# Example usage
df = get_data()
df.head(3)

[*********************100%***********************]  5 of 5 completed

Data saved to data/20251109-171854.csv





Price,Close,Close,Close,Close,Close,High,High,High,High,High,...,Open,Open,Open,Open,Open,Volume,Volume,Volume,Volume,Volume
Ticker,AAPL,AMZN,GOOG,META,NFLX,AAPL,AMZN,GOOG,META,NFLX,...,AAPL,AMZN,GOOG,META,NFLX,AAPL,AMZN,GOOG,META,NFLX
Datetime,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2
2025-11-03 14:30:00+00:00,267.630005,255.339996,283.109985,648.506592,1099.681763,270.779999,258.600006,283.350006,659.329895,1133.5,...,270.420013,255.639999,282.470001,656.0,1133.099976,10223517,26213930,4971493,6069693,989317
2025-11-03 15:30:00+00:00,266.640015,255.460007,282.940002,649.875,1080.439941,268.279999,256.859985,283.880005,652.580017,1103.480957,...,267.640015,255.339996,283.079987,648.505005,1099.532471,3102546,8062082,1577709,2263349,785387
2025-11-03 16:30:00+00:00,267.535004,256.010315,282.839996,650.169983,1089.849976,268.149994,256.440002,284.100006,653.0,1089.98999,...,266.659912,255.455002,282.970001,649.849976,1079.959961,2821699,5726412,1006362,1443661,595383


---

## 🔹Problem 2: Plotting Data  
---

### - Requirements:  

Write a function called `plot_data()` that opens the latest data file in the `data` folder and, on one plot, plots the `Close` prices for each of the five stocks.
The plot should include axis labels, a legend, and the date as a title.
The function should save the plot into a `plots` folder in the root of your repository using a filename in the format `YYYYMMDD-HHmmss.png`.
Create the `plots` folder if you don't already have one.

### - Approach:  
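The function looks for the most recent CSV file in the `data` folder, reads it with pandas as a table with two column levels, and plots the `Close` prices of each ticker on a single Matplotlib figure. It then adds axis labels, a legend, and a dated title, and saves the figure to the `plots` folder under a timestamped filename.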


### - Implementation:

In [None]:
# Script of the `plot_data()` function
#
# References:
# https://matplotlib.org/stable/gallery/index.html
# https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
# https://docs.python.org/3/library/os.html
# https://docs.python.org/3/library/glob.html
# OpenAI

def plot_data():
    """
    Opens the latest CSV file in the 'data' folder and plots the Close prices
    for the FAANG companies on one chart. The plot includes axis labels,
    a legend, and the current date as a title. It saves the plot as a PNG file
    in the 'plots' folder with a timestamped filename.
    """

    # Import libraries inside the function
    import pandas as pd
    import matplotlib.pyplot as plt
    import datetime as dt
    import os
    import glob

    # Create 'plots' folder if it doesn't exist
    # Ref: https://docs.python.org/3/library/os.html
    # Ref: OpenAI
    os.makedirs("plots", exist_ok=True)

    # Find the latest CSV file in the 'data' folder
    list_of_files = glob.glob("data/*.csv")
    if not list_of_files:
        print("No CSV files found in the 'data' folder.")
        return
    latest_file = max(list_of_files, key=os.path.getctime)

    # Read the latest data file; the two header rows hold the Price and Ticker levels
    df = pd.read_csv(latest_file, header=[0, 1], index_col=0, parse_dates=True)

    # Plot Close prices for each FAANG stock
    plt.figure(figsize=(12, 6))

    # Columns are a MultiIndex ordered (Price, Ticker), e.g. ('Close', 'AAPL')
    for ticker in ['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG']:
        if ('Close', ticker) in df.columns:
            plt.plot(df.index, df[('Close', ticker)], label=ticker)

    # Add labels, title, and legend
    plt.xlabel("Date and Time")
    plt.ylabel("Close Price (USD)")
    plt.title(f"FAANG Stock Close Prices - {dt.datetime.now().strftime('%Y-%m-%d')}")
    plt.legend()

    # Generate filename with timestamp
    filename = "plots/" + dt.datetime.now().strftime("%Y%m%d-%H%M%S") + ".png"

    # Save the plot
    plt.savefig(filename)
    plt.close()
    
    # Print confirmation message
    print(f"Plot saved to {filename}")
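
The function above picks the latest file by creation time. Since the filenames follow `YYYYMMDD-HHmmss`, they also sort chronologically by name, which may be more dependable where file timestamps do not reflect when the data was downloaded (for example, after a fresh clone of the repository). A minimal sketch of that alternative, using a hypothetical helper named `latest_by_name`:

```python
from pathlib import Path


def latest_by_name(folder, pattern="*.csv"):
    """Return the path whose filename sorts last, or None if nothing matches."""
    files = sorted(Path(folder).glob(pattern), key=lambda p: p.name)
    return files[-1] if files else None
```

This relies on every file in the folder following the timestamp naming scheme; a file such as `data.csv` would sort after all the dated names and be picked instead.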


---

## 🔹Problem 3: Script  
---

### - Requirements:  

Create a Python script called `faang.py` in the root of your repository.
Copy the above functions into it and set it up so that whenever someone at the terminal types `./faang.py`, the script runs, downloading the data and creating the plot.
Note that this will require a shebang line and the script to be marked executable.
Explain the steps you took in your notebook.
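
One possible layout for such a script is sketched below. The function bodies are placeholders standing in for the `get_data()` and `plot_data()` functions defined earlier, and the `main()` wrapper is an assumption about how the script might be organised, not a requirement of the brief.

```python
#!/usr/bin/env python3
"""Sketch of a possible faang.py layout (function bodies are placeholders)."""


def get_data():
    # Placeholder: the real body is the get_data() function from Problem 1.
    return "data step done"


def plot_data():
    # Placeholder: the real body is the plot_data() function from Problem 2.
    return "plot step done"


def main():
    # Download first so that plot_data() finds a fresh CSV file.
    return get_data(), plot_data()


if __name__ == "__main__":
    main()
```

The first line is the shebang, which tells the shell to run the file with `python3`. Running `chmod +x faang.py` once marks the file as executable, after which `./faang.py` starts the script directly.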

---

## 🔹Problem 4: Automation  
---

### - Requirements:  

Create a [GitHub Actions workflow](https://docs.github.com/en/actions) to run your script every Saturday morning.
The workflow file should be called `faang.yml` and placed in a `.github/workflows/` folder in the root of your repository.
In your notebook, explain each of the individual lines in your workflow.
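
Scheduled workflows in GitHub Actions are triggered by a five-field cron expression evaluated in UTC. The sketch below is a way to sanity-check a candidate expression before putting it in the workflow. The schedule `0 8 * * 6` (08:00 UTC on Saturdays) is an assumed example, since the brief only says "Saturday morning", and `cron_matches` is a hypothetical helper that handles only plain numbers and `*`.

```python
import datetime as dt

CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def cron_matches(expression, when):
    """Check a datetime against a cron line made only of numbers and '*'."""
    actual = {
        "minute": when.minute,
        "hour": when.hour,
        "day_of_month": when.day,
        "month": when.month,
        # Cron counts Sunday as 0 and Saturday as 6.
        "day_of_week": when.isoweekday() % 7,
    }
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError("expected five space-separated fields")
    return all(
        part == "*" or int(part) == actual[name]
        for name, part in zip(CRON_FIELDS, parts)
    )


# Assumed schedule: 08:00 UTC every Saturday.
SCHEDULE = "0 8 * * 6"
```

For instance, `cron_matches(SCHEDULE, dt.datetime(2025, 11, 8, 8, 0))` checks the expression against Saturday 8 November 2025 at 08:00.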

---

## End