# **Assessment Problems**

## Data from yfinance

Using the yfinance Python package, write a function called get_data() that downloads all hourly data for the previous five days for the five FAANG stocks:

- Facebook (META)
- Apple (AAPL)
- Amazon (AMZN)
- Netflix (NFLX)
- Google (GOOG)

The function should save the data into a folder called data in the root of your repository using a filename with the format YYYYMMDD-HHmmss.csv, where YYYYMMDD is the four-digit year (e.g. 2025), followed by the two-digit month (e.g. 09 for September), followed by the two-digit day, and HHmmss is hour, minutes, seconds.
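The filename format can be checked in isolation before any data is downloaded. The sketch below is only an illustration: the helper name build_csv_path and the fixed example datetime are assumptions made here, not part of the task.

```python
import datetime as dt
import os


def build_csv_path(now, folder="data"):
    """Return folder/YYYYMMDD-HHmmss.csv for the given datetime (hypothetical helper)."""
    # %Y = four-digit year, %m = two-digit month, %d = two-digit day,
    # %H%M%S = hour (24-hour clock), minutes, seconds.
    return os.path.join(folder, now.strftime("%Y%m%d-%H%M%S") + ".csv")


# A fixed, made-up datetime makes the result easy to check by eye.
example = dt.datetime(2025, 9, 6, 14, 30, 5)
print(build_csv_path(example))
```

Passing dt.datetime.now() instead of the fixed value gives the name for the current moment.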

Imports

In [1]:
import yfinance as yf # Yahoo Finance data.
import pandas as pd # Pandas library
import os
import datetime as dt
import matplotlib.pyplot as plt

In [None]:


def get_data():
    """Download hourly data for the previous five days for the FAANG stocks and save it as a CSV file."""

    # List with the five stock tickers
    tickers = ['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG']

    # Dictionary holding one DataFrame per ticker
    stocks_data = {}

    # Looping through tickers
    for ticker in tickers:

        # Fetching hourly data for the previous five days
        df = yf.download(ticker, period='5d', interval='1h', auto_adjust=False)

        # Converting datetime from an index to a column, for better visualization
        df.reset_index(inplace=True)

        stocks_data[ticker] = df

    # Current time
    now = dt.datetime.now()

    # Placing the per-ticker tables side by side (each one keeps its own Datetime column)
    all_data = pd.concat(stocks_data.values(), axis=1)

    # Creating the data folder if it does not exist yet
    os.makedirs("data", exist_ok=True)

    # Creating CSV file with required naming format and current time
    all_data.to_csv("data/" + now.strftime("%Y%m%d-%H%M%S") + ".csv", index=False)

# Run the function
get_data()

[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
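The saved CSV is later read back with a two-row header, so it may help to see how two-level column labels survive a round trip through CSV. The sketch below uses made-up prices and a hypothetical fake_download() helper in place of yf.download(); it assumes each per-ticker table carries (price field, ticker) column labels, which is the layout the plotting code relies on.

```python
import io

import pandas as pd

# Made-up timestamps and prices; this is not real market data.
times = pd.Index(pd.to_datetime(["2025-09-05 13:30", "2025-09-05 14:30"]), name="Datetime")


def fake_download(ticker, closes):
    """Hypothetical stand-in for one per-ticker download with two-level columns."""
    columns = pd.MultiIndex.from_tuples([("Close", ticker)], names=["Price", "Ticker"])
    frame = pd.DataFrame([[c] for c in closes], index=times, columns=columns)
    # Same step as in get_data(): move the datetime index into a column.
    return frame.reset_index()


frames = [fake_download("META", [1.0, 2.0]), fake_download("AAPL", [3.0, 4.0])]
all_data = pd.concat(frames, axis=1)

# Write to an in-memory buffer instead of a file, then read it back.
buffer = io.StringIO()
all_data.to_csv(buffer, index=False)
buffer.seek(0)
loaded = pd.read_csv(buffer, header=[0, 1])

# The first column holds the datetimes; Close prices are addressed by (field, ticker).
print(loaded.columns[0][0])
print(list(loaded[("Close", "META")]))
```

Because every per-ticker table keeps its own Datetime column, the combined file contains one Datetime column per ticker; the first one is enough for plotting as long as all tickers share the same timestamps.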


https://github.com/ranaroussi/yfinance
https://www.geeksforgeeks.org/python/getting-stock-symbols-with-yfinance-in-python/ - Fetching multiple tickers with a list; custom interval
https://medium.com/@kasperjuunge/yfinance-10-ways-to-get-stock-data-with-python-6677f49e8282 - How to download periodic data
https://huggingface.co/Adilbai/stock-trading-rl-agent/blob/27f177526bebcc8cc49daa4cd66566d360feae0d/dataprocessor.py - Inspiration code
https://www.geeksforgeeks.org/python/how-to-create-filename-containing-date-or-time-in-python/ - Creating a file name with a date and time format
https://github.com/ranaroussi/yfinance/issues/2308 - auto_adjust set to fix an error; downloading a newer yfinance version did not work
https://medium.com/@shouke.wei/mastering-stock-data-analysis-with-yfinance-in-python-63e91a6c41c2 - Latest inspiration code for the rework

## Plotting Data

Write a function called plot_data() that opens the latest data file in the data folder and, on one plot, plots the Close prices for each of the five stocks. The plot should include axis labels, a legend, and the date as a title. The function should save the plot into a plots folder in the root of your repository using a filename in the format YYYYMMDD-HHmmss.png. Create the plots folder if you don't already have one.

In [None]:
def plot_data():

    path = "data"

    # Find all .csv files in the folder
    csv_files = [x for x in os.listdir(path) if x.endswith(".csv")]

    # Find the most recent CSV file by modification time
    recent_csv = max(csv_files, key=lambda x: os.stat(os.path.join(path, x)).st_mtime)

    # Joining folder and file name to create the full file path
    latest_path = os.path.join(path, recent_csv)

    # Reading the CSV into a DataFrame; the header spans two rows (price field, ticker)
    df = pd.read_csv(latest_path, header=[0, 1])

    # Converting the first column, which holds the dates, to datetime
    datetime_column = df.columns[0]
    df[datetime_column] = pd.to_datetime(df[datetime_column])

    # Defining tickers to be plotted
    tickers = ['META', 'AAPL', 'AMZN', 'NFLX', 'GOOG']

    plt.figure(figsize=(14, 6))  # making plot wider for better visualization

    # Plot Close prices for each ticker
    for ticker in tickers:
        plt.plot(df[datetime_column], df[('Close', ticker)], label=ticker)

    # Current time, used for both the title and the file name
    now = dt.datetime.now()

    # Defining labels and title; the task asks for the date as the title
    plt.xlabel("Datetime")
    plt.ylabel("Close Price")
    plt.title(now.strftime("%Y-%m-%d"))
    plt.legend()

    # Creating the plots folder if it does not exist yet
    os.makedirs("plots", exist_ok=True)

    # Saving plot to 'plots' folder
    timestamp = now.strftime("%Y%m%d-%H%M%S")
    save_path = os.path.join("plots", f"{timestamp}.png")
    plt.savefig(save_path)
    plt.close()
    

plot_data()

To do: check whether the function really picks the latest file.
To do: add dpi=300 when saving the plot.
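One way to double-check the choice of the latest file is to pick it by name instead of by modification time. This is a sketch of an alternative, not the approach used in plot_data(): the helper latest_csv_by_name and the example file names are made up for illustration, and the idea assumes every CSV in the folder follows the YYYYMMDD-HHmmss.csv naming.

```python
import os
import tempfile


def latest_csv_by_name(folder):
    """Return the CSV file name with the greatest timestamp in its name, or None."""
    csv_files = [name for name in os.listdir(folder) if name.endswith(".csv")]
    # YYYYMMDD-HHmmss names are fixed-width and zero-padded,
    # so plain string ordering matches chronological ordering.
    return max(csv_files, default=None)


# Demonstration in a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as folder:
    for name in ["20250906-080000.csv", "20250830-235959.csv", "notes.txt"]:
        open(os.path.join(folder, name), "w").close()
    newest = latest_csv_by_name(folder)

print(newest)
```

Modification times can change when files are copied or checked out again, whereas the timestamp in the name stays fixed. The default=None argument also avoids the ValueError that max() raises on an empty list.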

https://stackoverflow.com/questions/58881381/using-python-to-identify-and-load-last-csv-file-in-directory-by-updated-time#:~:text=Open%20the%20directory%20and%20filter,file%20in%20the%20target%20directory. - Finding the latest CSV file
https://www.geeksforgeeks.org/python/python-os-path-join-method/ - os.path.join method
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html - Plotting library
https://towardsdatascience.com/working-with-multi-index-pandas-dataframes-f64d2e2c3e02/#:~:text=A%20multi%2Dindex%20(also%20known,exciting%20to%20represent%20your%20data - Multi-level indexing

## Script

Create a Python script called faang.py in the root of your repository. Copy the above functions into it and edit it so that whenever someone at the terminal types ./faang.py, the script runs, downloading the data and creating the plot. Note that this will require a shebang line and the script to be marked executable. Explain the steps you took in your notebook.
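A possible layout for faang.py is sketched below. It is only an outline under stated assumptions: the two function bodies are placeholders to be filled with the notebook code, and the main() wrapper is a naming choice made here. The shebang has to be the very first line of the file, and the file can be marked executable with chmod +x faang.py.

```python
#!/usr/bin/env python3
# Sketch of how faang.py could be laid out; the function bodies are placeholders.


def get_data():
    """Placeholder: paste the notebook's get_data() body (and its imports) here."""


def plot_data():
    """Placeholder: paste the notebook's plot_data() body (and its imports) here."""


def main():
    # Download first so that plot_data() finds the fresh CSV file.
    get_data()
    plot_data()


# Run only when executed as a script, not when imported as a module.
if __name__ == "__main__":
    main()
```

The main guard keeps the download and plot from running if another module imports the functions from faang.py.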

## Automation

Create a GitHub Actions workflow to run your script every Saturday morning. The workflow file should be called faang.yml and placed in a .github/workflows/ folder in the root of your repository. In your notebook, explain each of the individual lines in your workflow.

## End