# Assessment Problems

## Problem 1: Data from yfinance


https://github.com/ranaroussi/yfinance


In [49]:
# Dates and times.
import datetime as dt

# Data frames.
import pandas as pd

# Operating system.
import os

# Yahoo finance data.
import yfinance as yf


In [50]:
# Tickers:
# A list of stock symbols used to request data from yfinance.

# Get data:
# get_data is a small wrapper around yf.download, defined in the next cell.
# period is the span of history to fetch (e.g. "5d"); interval is the spacing between rows (e.g. "1d" or "1h").

# Download data:
# yf.download uses the yfinance Python library to download historical stock data.
# See: https://medium.com/%40anjalivemuri97/day-4-fetching-historical-stock-data-with-yfinance-f45f3bd8b9c6
# I pass auto_adjust=True explicitly to suppress the FutureWarning about its changed default.
# See: https://github.com/ranaroussi/yfinance/blob/0713d9386769b168926d3959efd8310b56a33096/yfinance/utils.py#L445-L462

# DataFrame:
# The pandas structure that holds the downloaded data. It is widely used for data analysis, cleaning, and visualization, and supports filtering, sorting, and aggregation.
# See: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html

In [51]:
# Get historical data for multiple tickers at once:
tickers = ["META", "AAPL", "AMZN", "NFLX", "GOOGL"]

# Get data:
def get_data(tickers, period="5d", interval="1h"):
    data = yf.download(tickers, period=period, interval=interval, group_by='ticker', auto_adjust=True)
    return data


df = get_data(tickers, period="5d", interval="1d")

[*********************100%***********************]  5 of 5 completed


In [52]:
# Saving data into csv file:
# See: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html 

# Date time:
# Used to record the exact date and time
# See: https://docs.python.org/3/library/datetime.html

In [53]:
from datetime import datetime

def save_data(df):
    folder = "data"
    os.makedirs(folder, exist_ok=True)

    # Generate timestamp filename
    timestamp = datetime.now().strftime("%Y%m%d-%H%M%S") 
    filename = f"{timestamp}.csv"

    # Full path
    filepath = os.path.join(folder, filename)

    # Save dataframe, keeping the Date index so the rows can be matched to dates when the file is read back
    df.to_csv(filepath)

    print(f"Saved file: {filepath}")
    return filepath
save_data(df)

Saved file: data\20251209-112528.csv


'data\\20251209-112528.csv'
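The following is a minimal sketch of why the Date index matters when a frame with two-level columns is written to CSV and read back. It uses a small synthetic frame (the values and the name `demo` are made up for illustration) and an in-memory buffer instead of a file on disk, so it does not need a network connection.

```python
import io

import pandas as pd

# Synthetic stand-in for the downloaded data (values are made up).
columns = pd.MultiIndex.from_product(
    [["AAPL", "META"], ["Close"]], names=["Ticker", "Price"]
)
index = pd.to_datetime(["2025-12-02", "2025-12-03"]).rename("Date")
demo = pd.DataFrame([[10.5, 20.5], [11.5, 21.5]], index=index, columns=columns)

# Write the frame with its index (the to_csv default) into an in-memory buffer.
buffer = io.StringIO()
demo.to_csv(buffer)
buffer.seek(0)

# Two header rows rebuild the column MultiIndex; column 0 becomes the Date index.
restored = pd.read_csv(buffer, header=[0, 1], index_col=0, parse_dates=True)

print(restored)
```

If the index were dropped on writing, `index_col=0` would instead pick up the first price column as the index when reading the file back.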

## Problem 2: Plotting Data

In [54]:
import datetime as dt
import os

import matplotlib
matplotlib.use('Agg')  # Anti-Grain Geometry: a non-interactive backend that renders figures to files instead of displaying them on screen
import matplotlib.pyplot as plt

In [55]:
# Display first few rows of the DataFrame
df.head()

Ticker,META,META,META,META,META,NFLX,NFLX,NFLX,NFLX,NFLX,...,GOOGL,GOOGL,GOOGL,GOOGL,GOOGL,AAPL,AAPL,AAPL,AAPL,AAPL
Price,Open,High,Low,Close,Volume,Open,High,Low,Close,Volume,...,Open,High,Low,Close,Volume,Open,High,Low,Close,Volume
Date
2025-12-02,642.340027,647.869995,638.070007,647.099976,11640900,109.209999,109.730003,107.519997,109.349998,25763000,...,316.532931,318.171873,313.704794,315.603546,35854700,283.0,287.399994,282.630005,286.190002,53669500
2025-12-03,644.409973,648.849976,637.549988,639.599976,11134300,106.589996,106.870003,102.029999,103.959999,53593400,...,315.683536,321.369789,313.894697,319.421082,41838300,286.200012,288.619995,283.299988,284.149994,43538700
2025-12-04,676.0,676.099976,660.049988,661.530029,29874600,103.57,103.800003,101.769997,103.220001,51779100,...,322.019387,322.149276,314.49431,317.412384,31240900,284.100006,284.730011,278.589996,280.700012,43989100
2025-12-05,664.0,674.690002,662.390015,673.419983,21207900,98.779999,104.790001,97.739998,100.239998,133363600,...,319.281132,322.948746,318.961364,321.059967,28851700,280.540009,281.140015,278.049988,278.779999,47265800
2025-12-08,669.340027,676.710022,665.070007,666.799988,13091600,99.870003,99.889999,95.300003,96.790001,100206800,...,320.049988,320.440002,311.220001,313.720001,33835800,278.130005,279.670013,276.149994,277.890015,38140000


In [56]:
df.columns

MultiIndex([( 'META',   'Open'),
            ( 'META',   'High'),
            ( 'META',    'Low'),
            ( 'META',  'Close'),
            ( 'META', 'Volume'),
            ( 'NFLX',   'Open'),
            ( 'NFLX',   'High'),
            ( 'NFLX',    'Low'),
            ( 'NFLX',  'Close'),
            ( 'NFLX', 'Volume'),
            ( 'AMZN',   'Open'),
            ( 'AMZN',   'High'),
            ( 'AMZN',    'Low'),
            ( 'AMZN',  'Close'),
            ( 'AMZN', 'Volume'),
            ('GOOGL',   'Open'),
            ('GOOGL',   'High'),
            ('GOOGL',    'Low'),
            ('GOOGL',  'Close'),
            ('GOOGL', 'Volume'),
            ( 'AAPL',   'Open'),
            ( 'AAPL',   'High'),
            ( 'AAPL',    'Low'),
            ( 'AAPL',  'Close'),
            ( 'AAPL', 'Volume')],
           names=['Ticker', 'Price'])
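With two-level columns like these, a single ticker or a single field can be pulled out directly. The sketch below uses a small synthetic frame with the same level names (`Ticker`, `Price`); the values and the name `demo` are made up for illustration.

```python
import pandas as pd

# Synthetic stand-in for the yfinance result (values are made up).
columns = pd.MultiIndex.from_product(
    [["AAPL", "META"], ["Close", "Volume"]], names=["Ticker", "Price"]
)
index = pd.to_datetime(["2025-12-02", "2025-12-03"]).rename("Date")
demo = pd.DataFrame(
    [[10.0, 100, 20.0, 200], [11.0, 110, 21.0, 210]],
    index=index,
    columns=columns,
)

# All columns for one ticker: the outer level is dropped.
aapl = demo["AAPL"]

# One field across every ticker: take a cross-section on the inner level.
closes = demo.xs("Close", axis=1, level="Price")

print(list(aapl.columns))    # ['Close', 'Volume']
print(list(closes.columns))  # ['AAPL', 'META']
```

The cross-section form is handy for plotting, since `closes.plot()` would draw one line per ticker without listing each `(ticker, 'Close')` pair by hand.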

In [57]:
df.index    

DatetimeIndex(['2025-12-02', '2025-12-03', '2025-12-04', '2025-12-05',
               '2025-12-08', '2025-12-09'],
              dtype='datetime64[ns]', name='Date', freq=None)

In [58]:
# Plotting stock closing prices
df[[('AMZN', 'Close'),
    ('META', 'Close'),
    ('GOOGL', 'Close'),
    ('AAPL', 'Close'),
    ('NFLX', 'Close')]].plot(figsize=(12,6))


<Axes: xlabel='Date'>

In [None]:
def plot_data():
    data_folder = "data"
    plots_folder = "plots"
    tickers = ["AAPL", "AMZN", "META", "NFLX", "GOOGL"]

    # Ensure plots folder exists
    os.makedirs(plots_folder, exist_ok=True)

    # Get list of CSV files in data folder (only CSV is supported here)
    files = [os.path.join(data_folder, f) for f in os.listdir(data_folder) if f.endswith(".csv")]
    if not files:
        print("No CSV files found!")
        return

    # Find latest file by modification time
    latest_file = max(files, key=os.path.getmtime)

    print(f"Opening latest data file: {latest_file}")

    # Read file into a DataFrame with MultiIndex columns like ('AAPL', 'Close')
    df = pd.read_csv(latest_file, header=[0, 1], index_col=0, parse_dates=True)

    plt.figure(figsize=(12, 6))

    for ticker in tickers:
        try:
            plt.plot(df.index, df[(ticker, "Close")], label=ticker)
        except KeyError:
            print(f"Close price for {ticker} not found")

    # Timestamp of this run, used in the plot title and file name
    timestamp = dt.datetime.now().strftime("%Y%m%d-%H%M%S")

    plt.title(f"Closing Prices - {timestamp[:8]}")  # YYYYMMDD
    plt.xlabel("Date")
    plt.ylabel("Closing Price (USD)")
    plt.legend()
    plt.tight_layout()

    # Save plot
    plot_path = os.path.join(plots_folder, f"{timestamp}.png")
    plt.savefig(plot_path, dpi=300)
    plt.close()

    print(f"Saved plot: {plot_path}")

if __name__ == "__main__":
    plot_data()


Opening latest data file: data\20251209-112528.csv
Saved plot: plots\20251209-112707.png
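The plotting code picks the latest file by modification time. Because the file names follow the `YYYYMMDD-HHMMSS` pattern, sorting by name gives the same chronological order, and the timestamp can be parsed back out of the name. The sketch below shows this with the standard library only; `latest_by_name` and `stamp_of` are hypothetical helper names, and the files are empty placeholders created in a temporary folder.

```python
import tempfile
from datetime import datetime
from pathlib import Path

STAMP_FORMAT = "%Y%m%d-%H%M%S"  # same pattern as the data file names


def latest_by_name(folder):
    """Return the CSV whose timestamped name sorts last, or None if there is none."""
    candidates = sorted(Path(folder).glob("*.csv"), key=lambda p: p.stem)
    return candidates[-1] if candidates else None


def stamp_of(path):
    """Parse the timestamp encoded in a file name back into a datetime."""
    return datetime.strptime(Path(path).stem, STAMP_FORMAT)


with tempfile.TemporaryDirectory() as tmp:
    # Create empty placeholder files; the .txt file should be ignored.
    for name in ["20251209-112528.csv", "20251202-090000.csv", "notes.txt"]:
        (Path(tmp) / name).touch()
    newest = latest_by_name(tmp)
    print(newest.name)       # 20251209-112528.csv
    print(stamp_of(newest))  # 2025-12-09 11:25:28
```

Selecting by name does not depend on file modification times, which can change when files are copied or checked out from a repository.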


## Problem 3: Script

In [None]:
#!/usr/bin/env python

# Dates and times
import datetime as dt

# Operating system
import os

# Yahoo Finance data
import yfinance as yf

# Get data
df = yf.download(["META", "AAPL", "AMZN", "NFLX", "GOOGL"], period="5d", interval="1h", auto_adjust=True)

# Current date and time
now = dt.datetime.now()

# Make sure the output folder exists
os.makedirs("data", exist_ok=True)

# File name
filename = "data/" + now.strftime("%Y%m%d-%H%M%S") + ".csv"

# Save data as CSV file
df.to_csv(filename)

[*********************100%***********************]  5 of 5 completed


## Problem 4: Automation

### Explanation of my workflow 

#### Workflow Name
- name: Weekly FAANG Script Run

This is the name that will appear in the GitHub Actions tab. It helps identify which workflow is running.

#### Run Label
- run-name: ${{ github.actor }} created a FAANG workflow run

This sets a dynamic label that appears in the workflow run history.
${{ github.actor }} expands to the username of whoever triggered the workflow, whether manually or by a commit.

#### Triggers (on:)
- on:

This section controls when GitHub starts the workflow.

#### Scheduled runs
- 0 9 * * SAT     - Run every Saturday at 09:00 UTC

This runs the workflow automatically every week. The five cron fields are, in order: minute, hour, day of month, month, day of week.
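To see what the schedule means in practice, the sketch below computes the next Saturday 09:00 UTC after a given moment using only the standard library. The function name `next_saturday_0900` is hypothetical, and the result is only the nominal time: it does not model when GitHub actually starts a scheduled run.

```python
from datetime import datetime, timedelta, timezone

# datetime.weekday() counts Monday as 0, so Saturday is 5.
# (Cron counts Sunday as 0, which is why the numbers differ.)
SATURDAY = 5


def next_saturday_0900(now):
    """Return the first Saturday 09:00 UTC strictly after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    days_ahead = (SATURDAY - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=9, minute=0, second=0, microsecond=0
    )
    # If it is already Saturday at or past 09:00, move to the following week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


start = datetime(2025, 12, 9, 11, 25, tzinfo=timezone.utc)  # a Tuesday
print(next_saturday_0900(start))  # 2025-12-13 09:00:00+00:00
```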

#### Manual Trigger
- workflow_dispatch:

This adds a Run workflow button to the Actions tab on GitHub.
It is useful for testing without waiting until the next Saturday.

#### Jobs Section
- jobs:
  run_faang_script:

A job is a group of steps that runs on a machine provided by GitHub (a runner).

#### Machine Configuration
- runs-on: ubuntu-latest

GitHub allocates a clean cloud machine with:
✔ Linux OS
✔ Python preinstalled
✔ Git tools
✔ Permissions to run workflows

The job runs on Linux even though my laptop runs Windows, because the workflow requests ubuntu-latest. Linux runners generally start faster, cost less to run, and provide a standard environment, whereas Windows runners tend to take longer to set up.

### Steps Inside the Job

#### Checkout repository
- name: Checkout repository
  uses: actions/checkout@v6

This downloads the GitHub repository onto the runner and gives the job access to its files (e.g. faang.py, requirements.txt).
Without it, the runner has nothing to execute.

#### Configure Python
- name: Set up Python
  uses: actions/setup-python@v5
  with:
    python-version: '3.12'

This installs the requested Python version (3.12) and puts it on the PATH, so the job does not depend on whichever Python version is preinstalled on the runner image.

#### Install dependencies
- name: Install dependencies
  run: |
    python -m pip install --upgrade pip
    pip install -r requirements.txt

Explanation:
The first command upgrades pip, which helps avoid installation errors caused by an outdated pip. The second command installs everything listed in requirements.txt.

This includes libraries like:

✔ yfinance
✔ pandas
✔ matplotlib
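A quick way to confirm that the install step worked is to check that each library can be found before running the main script. The sketch below is one possible check using the standard library; `missing_packages` is a hypothetical helper, not part of the workflow described here.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["yfinance", "pandas", "matplotlib"]
missing = missing_packages(required)

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Running a check like this at the top of the script would make a missing dependency show up as a clear message instead of an ImportError partway through.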

#### Run Python script
- name: Run FAANG script
  run: |
    python faang.py

This executes the script just as running it locally would. As long as the script downloads the stock data, generates the plots, and saves the output into /plots/, the workflow produces fresh results automatically every week.

### Final Summary
The workflow:

- runs automatically at 09:00 UTC on Saturdays

- can be started manually

- starts in a clean Linux environment

- installs the necessary dependencies

- runs faang.py

This ensures that we receive a fresh financial review every week without having to keep a laptop open.


## END