# Assessment Problems: Computer Infrastructure

## Imports

In [2]:
# Dates and times
import datetime as dt
# https://docs.python.org/3/library/datetime.html

# Numerical computing
import numpy as np
# https://numpy.org/doc/2.3/user/absolute_beginners.html

# Stock data from Yahoo Finance
import yfinance as yf
# https://pypi.org/project/yfinance/

# Dataframes
import pandas as pd
# https://pandas.pydata.org/docs/user_guide/index.html

# Data directory
from pathlib import Path 
# https://docs.python.org/3/library/pathlib.html

# Plotting
import matplotlib.pyplot as plt
# https://matplotlib.org/3.5.3/api/_as_gen/matplotlib.pyplot.html

## Problem 1: Data from yfinance

*yfinance documentation:* https://ranaroussi.github.io/yfinance/

For this problem, the following tasks will be completed:
- Write a function called `get_data()`
- This function downloads all hourly data from 5 companies from the past 5 days 
- The 5 companies are the 5 FAANG stocks: Facebook (META), Apple (AAPL), Amazon (AMZN), Netflix (NFLX), & Google (GOOG) *(See: [Investopedia: FAANG Stocks](https://www.investopedia.com/terms/f/faang-stocks.asp))*
- The function will save the data into the folder `data` and each filename will be in the following format: `YYYYMMDD-HHmmss.csv`

First, I used the `yf.Tickers` function to identify the stocks I needed.

In [2]:
# Identify the stocks I want to download
tickers = yf.Tickers('META AAPL AMZN NFLX GOOG')

I had a look on https://algotrading101.com/learn/yfinance-guide/ to get a better idea of how to get the data and how to ask for a 5 day period with 1 hour intervals.

In [3]:
# Download the data
df = yf.download(['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG'], period='5d', interval='1h')

  df = yf.download(['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG'], period='5d', interval='1h')
[*********************100%***********************]  5 of 5 completed


Then I saved the dataframe to a CSV file *(See: [Ian McLoughlin, Lecture videos Week 5](https://atlantictu-my.sharepoint.com/personal/ian_mcloughlin_atu_ie/_layouts/15/stream.aspx?id=%2Fpersonal%2Fian%5Fmcloughlin%5Fatu%5Fie%2FDocuments%2Fstudent%5Fshares%2Fcomputer%2Dinfrastructure%2F21%2Dsaving%2Ddata%2Emkv&referrer=StreamWebApp%2EWeb&referrerScenario=AddressBarCopied%2Eview%2E71983e3b%2D2c6e%2D4ed7%2D8a29%2Db55be8918d26)).*

I use the [`date.strftime`](https://docs.python.org/3.6/library/datetime.html#datetime.date.strftime) method to format the filename as `YYYYMMDD-HHmmss.png`.

In [4]:
# COMMENTED OUT - it saves an extra CSV to the root)
# Current datetime
#now = dt.datetime.now()

# Filename
#filename = now.strftime("%Y%m%d-%H%M%S") + ".csv"

# Save to a CSV file with the current date and time as the filename
#df.to_csv(filename)

Now to put it all together into a function. 

I wasn't sure how to save it into the `data` folder so I checked the Pandas [`.to_csv()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html) documentation.

I found out that I needed to import `pathlib` or `os` to use a data directory.

In [5]:
# Function (putting it all together)

def get_data():
    # Download the data
    df = yf.download(['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG'],
                     period='5d', interval='1h')

    # Build folder and filename
    data_folder = Path().resolve() / "data"
    data_folder.mkdir(parents=True, exist_ok=True)

    filename = dt.datetime.now().strftime("%Y%m%d-%H%M%S") + ".csv"
    filepath = data_folder / filename

    # Save the file
    df.to_csv(filepath)

    return filepath

## Problem 2: Plotting Data

The main objectives for this problem are the following:
 - Write a function called `plot_data()` 
 - This function opens the latest data file in the `data` folder and, on one plot, plots the Close prices for each of the five stocks. 
 - The function should save the plot into the `plots` folder in the format YYYYMMDD-HHmmss.png.

The below code is adapted from [Copilot](https://copilot.microsoft.com/shares/9EGGcjhzuPrid1sVUni2k).

In [None]:
def plot_data():
    # Locate latest CSV in data folder
    data_folder = Path().resolve() / "data"
    latest_file = max(data_folder.glob("*.csv"), key=lambda x: x.stat().st_mtime)
    df = pd.read_csv(latest_file, header=[0,1], index_col=0, parse_dates=True)

    # Create a new figure
    plt.figure(figsize=(10,6))

    # Plot closing prices for all stocks
    df['Close'].plot()

    # Add title and labels
    plt.title("FAANG Stock Prices")
    plt.xlabel("Date")
    plt.ylabel("Price (USD)")
    plt.legend(title="Stocks")

    # Ensure plots folder exists
    plots_folder = Path().resolve() / "plots"
    plots_folder.mkdir(parents=True, exist_ok=True)

    # Filename with timestamp format YYYYMMDD-HHmmss.png
    fig_filename = dt.datetime.now().strftime("%YMMDD-%H%M%S") + ".png"
    fig_path = plots_folder / fig_filename

    # Save figure
    plt.savefig(fig_path)
    plt.close()

    return fig_path

## Problem 3: Script

In this section I must:
- Create a Python script called faang.py in the root of my repository. 
- Copy the above functions into it and make it so that whenever someone at the terminal types ./faang.py, the script runs, downloading the data and creating the plot. 

First I created the `faang.py` file in the root of my repository.

Then I wrote a shebang line at the start of the program to tell my computer that I want Python to execute the program when it is being executed. *(See: [Medium.com: A Deeper View into the Shebang](https://medium.com/@jcroyoaun/a-deeper-view-into-the-shebang-for-linux-scripting-4a26395df49d))*

In [7]:
#!/usr/bin/env python3

Then I imported the necessary modules.

In [8]:
"""
import yfinance as yf
import pandas as pd
import matplotlib.pyplot as plt
from pathlib import Path
import datetime as dt
"""

'\nimport yfinance as yf\nimport pandas as pd\nimport matplotlib.pyplot as plt\nfrom pathlib import Path\nimport datetime as dt\n'

After that, it was time to paste in the code for my functions `get_data()` and `plot_data()`. I took this code straight from the above cells in Problem 1 and 2.

I then created a function `main()` to give me a sanity check for when I run the other functions.

Then I used the `if __name__ == "__main__":` line to make my file accessible as a standalone script.

[ChatGPT](https://chatgpt.com/share/692ef5f0-bdf0-800c-8570-4d2775819ff9) helped me come up with these lines of code.

In [9]:
# Script 
def main():
    csv_path = get_data()
    print(f"Saved data to: {csv_path}")

    plot_path = plot_data()
    print(f"Saved plot to: {plot_path}")

# Ensure script runs when called directly
if __name__ == "__main__":
    main()

  df = yf.download(['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG'],
[*********************100%***********************]  5 of 5 completed


Saved data to: C:\Users\ZMH\OneDrive\Desktop\COMPINFRASTRUCTURE\computer-infrastructure-assessment\data\20251216-132042.csv
Saved plot to: C:\Users\ZMH\OneDrive\Desktop\COMPINFRASTRUCTURE\computer-infrastructure-assessment\plots\2025MMDD-132042.png


<Figure size 1000x600 with 0 Axes>

## Problem 4: Automation

In this problem, I will complete the following tasks:

- Create a GitHub Actions workflow (called `faang.yml`) to run my script every Saturday morning at 9am.

- Explain the code from `faang.yml` below.

I have adapted the code and some explanations below from [Copilot](https://copilot.microsoft.com/shares/ZT4wuksEYTXXx4yMrqVid).

First, I gave the workflow a name - FAANG Workflow. 

Then, I gave it a run name, i.e. the message that appears on GitHub when you run the workflow. I adapted this from the [demo code](https://docs.github.com/en/actions/get-started/quickstart) provided by GitHub.

In [None]:
"""
name: FAANG Workflow

run-name: ${{ github.actor }} is running FAANG script.
"""

Next, I defined when the workflow should run using `on:`.

`workflow_dispatch` allows the user to manually trigger the workflow, which will be helpful for when I want to test my code.

I scheduled the automated runs using `cron` syntax. `'0 8 * * 6'` means **"at 08:00 UTC every Saturday"**.

In [None]:
"""
# Trigger: every Saturday at 8:00 AM Irish time
on:
  workflow_dispatch: # allows manual triggering
  schedule:
    - cron: '0 8 * * 6'   # 8:00 UTC = 9:00 Irish time during summer
"""

Now I have to allow the workflow to make changes to my repository. 

Using `permissions` and `contents: write`, I gave the workflow permission to push changes back into my repo.

In [None]:
"""
permissions:
  contents: write
"""

I have defined one job in this workflow, named `run-faang`. 

It will run on the latest Ubuntu runner environment.

In [None]:
"""
jobs:
  run-faang:
    runs-on: ubuntu-latest
"""

The first step in my this job was to checkout the repository - this section of code pulls my repository into the runner.

`persist-credentials: true` ensures the GitHub Actions runner keeps the authentication token from actions/checkout.

In [None]:
"""
steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          persist-credentials: true
"""

Then next bit of code sets up Python to run my scripts, as the scripts in my repository are written in Python.

In [None]:
"""
- name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
"""

The next step is to install the dependencies. I have all imported libraries that I used written in a file called `requirments.txt` which makes for easy installation.

In [None]:
"""
- name: Install dependencies
        run: |
          pip install -r requirements.txt
"""

The next step was to run the script `faang.py`. This script downloads the last 5 days of hourly stocks data from the FAANG companies using [Yahoo Finance](https://finance.yahoo.com/markets/stocks/most-active/) and saves a plot of their recent closing prices. 

Because I have the shebang line at the top of my program, I tried to write this code without having to use the `python` command before the program name. This didn't work due to the fact that I'm on Windows but Mac/Linux users should be able to execute the program by writing `./faang.py`.

In [None]:
"""
- name: Run faang.py
        run: |
          python faang.py
"""

The final step is to commit and push the changes back to the repository. 

`git config` sets my identity for commits. I used my GitHub username and a noreply email (safe default).

`git add` Stages the new CSV and PNG files.

`git commit` commits the new files with an automated message.

`git push origin main` pushes to main.

In [None]:
"""
- name: Commit and push generated files 
        run: | 
          git config --global user.name "zoeharlowe" 
          git config --global user.email "zoeharlowe@users.noreply.github.com" 
          git add data/*.csv plots/*.png 
          git commit -m "Automated update: FAANG data and plots" 
          git push origin main
"""

## End