<div align='center'><font size="5" color='#353B47'>Scrape Stock Prices</font></div>
<div align='center'><font size="4" color="#353B47">CAC40</font></div>
<br>
<hr>

**<font color="#5963ab" size="4">Context</font>**

> This notebook demonstrates a concise script for scraping CAC40 stock prices with the yfinance library. The scraped data is available <a href="https://www.kaggle.com/datasets/bryanb/cac40-stocks-dataset">here</a>.

<h1 style="color:white;background:#5963ab;border-radius:5px;padding:30px;font-family:'Arial', cursive;font-size:50px;text-align:center">Imports</h1>

In [2]:
#!pip install yfinance

In [3]:
import yfinance as yf
import datetime
import csv
import pandas as pd

<h1 style="color:white;background:#5963ab;border-radius:5px;padding:30px;font-family:'Arial', cursive;font-size:50px;text-align:center">Settings</h1>

In [4]:
# Define the date range: the last three years, approximated as 3 * 365 days.
# end_date is computed once so both bounds share the same reference time.
end_date = datetime.datetime.now()
start_date = end_date - datetime.timedelta(days=365 * 3)

# List of CAC40 stock symbols
cac40_stocks = [
    'AC.PA',    # Accor
    'AI.PA',    # Air Liquide
    'AIR.PA',   # Airbus
    'ALO.PA',   # Alstom
    'MT.PA',    # ArcelorMittal (no price data was returned for this symbol in the run below)
    'CS.PA',    # AXA
    'BN.PA',    # Danone
    'EN.PA',    # Bouygues
    'CAP.PA',   # Capgemini
    'CA.PA',    # Carrefour
    'ACA.PA',   # Crédit Agricole
    'BNP.PA',   # BNP Paribas
    'ENGI.PA',  # ENGIE
    'EL.PA',    # EssilorLuxottica
    'RMS.PA',   # Hermès
    'OR.PA',    # L'Oréal
    'LR.PA',    # Legrand
    'MC.PA',    # LVMH
    'ML.PA',    # Michelin
    'ORA.PA',   # Orange
    'RI.PA',    # Pernod Ricard
    'UG.PA',    # Peugeot (merged into Stellantis in 2021; this symbol may no longer return data)
    'PUB.PA',   # Publicis Groupe
    'RNO.PA',   # Renault
    'SAF.PA',   # Safran
    'SGO.PA',   # Saint-Gobain
    'SAN.PA',   # Sanofi
    'SU.PA',    # Schneider Electric
    'GLE.PA',   # Société Générale
    'SW.PA',    # Sodexo
    'STM.PA',   # STMicroelectronics
    'HO.PA',    # Thales
    'FP.PA',    # TotalEnergies (trades as TTE on Euronext since 2021; this symbol may no longer return data)
    'ATO.PA',   # Atos
    'VIE.PA',   # Veolia
    'DG.PA',    # Vinci
    'VIV.PA',   # Vivendi
    'WLN.PA',   # Worldline
    'KER.PA',   # Kering
    'FR.PA'     # Valeo
]
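
Because the symbol list is maintained by hand, a quick sanity check before downloading can catch duplicates or a missing exchange suffix. The following is a minimal sketch, not part of the original notebook: `check_symbols` is a hypothetical helper, and the `.PA` suffix rule is an assumption based on the symbols listed above.

```python
from collections import Counter


def check_symbols(symbols, suffix=".PA"):
    """Return (duplicates, wrong_suffix) for a list of ticker symbols.

    Hypothetical helper: it only inspects the strings and does not
    contact Yahoo Finance, so it cannot tell whether a symbol is delisted.
    """
    counts = Counter(symbols)
    duplicates = sorted(s for s, n in counts.items() if n > 1)
    wrong_suffix = sorted({s for s in symbols if not s.endswith(suffix)})
    return duplicates, wrong_suffix


# Small illustrative list (not the full CAC40 list)
sample = ["AC.PA", "AI.PA", "AIR.PA", "AC.PA", "MT"]
duplicates, wrong_suffix = check_symbols(sample)
print(duplicates)    # ['AC.PA']
print(wrong_suffix)  # ['MT']
```

Passing `cac40_stocks` instead of `sample` would run the same check on the full list.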

<h1 style="color:white;background:#5963ab;border-radius:5px;padding:30px;font-family:'Arial', cursive;font-size:50px;text-align:center">Download</h1>

In [5]:
# Function to download stock data
def download_stock_data(stock, start_date, end_date):
    """
    Download historical stock data from Yahoo Finance for a given stock symbol and date range.

    :param stock: The stock symbol (ticker) for which to download the data, e.g., 'AAPL' for Apple Inc.
    :type stock: str
    :param start_date: The start date for the historical data range, as a datetime object.
    :type start_date: datetime.datetime
    :param end_date: The end date for the historical data range, as a datetime object.
    :type end_date: datetime.datetime
    :return: A pandas DataFrame containing historical stock data for the specified stock and date range.
             The DataFrame has the following columns: 'Date', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume', 'Stock'.
    :rtype: pandas.DataFrame
    """
    # Download the historical stock data for the given stock symbol and date range
    stock_data = yf.download(stock, start=start_date, end=end_date)

    # Reset the DataFrame index to make 'Date' a column instead of the index
    stock_data.reset_index(inplace=True)

    # Add a 'Stock' column to the DataFrame and set its value to the stock symbol
    stock_data['Stock'] = stock

    return stock_data
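
When Yahoo Finance has no data for a symbol, `yf.download` reports the failure and the resulting frame has no rows. A variant of the function above could make that case explicit. This is a sketch under assumptions: `download_or_none` and its `downloader` parameter are hypothetical names introduced here, and the column flattening assumes that some yfinance releases return two-level columns (field, ticker) even for a single symbol.

```python
import pandas as pd


def download_or_none(stock, start_date, end_date, downloader=None):
    """Return price rows for `stock`, or None when nothing came back.

    `downloader` is a hypothetical injection point (not in the original
    notebook): any callable shaped like downloader(stock, start=..., end=...)
    that returns a DataFrame. It defaults to yfinance.download.
    """
    if downloader is None:
        import yfinance as yf  # imported lazily so the helper can be tried offline
        downloader = yf.download

    stock_data = downloader(stock, start=start_date, end=end_date)
    if stock_data is None or stock_data.empty:
        return None

    # Assumption: if the columns are two-level, the first level holds the
    # field names ('Open', 'Close', ...), so keep only that level.
    if isinstance(stock_data.columns, pd.MultiIndex):
        stock_data = stock_data.copy()
        stock_data.columns = stock_data.columns.get_level_values(0)

    # Same post-processing as the original function
    stock_data = stock_data.reset_index()
    stock_data["Stock"] = stock
    return stock_data
```

Returning `None` leaves it to the caller to decide whether a missing symbol should be skipped, retried, or treated as an error.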

In [6]:
def aggregate_stocks(stocks, output_path, start_date, end_date):
    """
    Download historical stock data for a list of stock symbols and save the data in a CSV file.

    :param stocks: A list of stock symbols (tickers) for which to download historical data.
    :type stocks: list of str
    :param output_path: The file path where the aggregated stock data will be saved as a CSV file.
    :type output_path: str
    :param start_date: The start date for the historical data range, as a datetime object.
    :type start_date: datetime.datetime
    :param end_date: The end date for the historical data range, as a datetime object.
    :type end_date: datetime.datetime
    :return: A pandas DataFrame containing historical stock data for the specified stocks and date range.
             The DataFrame has the following columns: 'Date', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume', 'Stock'.
    :rtype: pandas.DataFrame
    """
    # Initialize an empty list to store the DataFrames for each stock
    all_stock_data = []

    # Download stock data for each stock symbol in the list
    for stock in stocks:
        stock_data = download_stock_data(stock, start_date, end_date)
        all_stock_data.append(stock_data)

    # Concatenate the DataFrames for each stock into a single DataFrame
    all_stock_data_df = pd.concat(all_stock_data, ignore_index=True)

    # Save the aggregated stock data to a CSV file
    all_stock_data_df.to_csv(output_path, index=False)
    
    return all_stock_data_df
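
The aggregation step concatenates whatever each download returns, so a failed symbol silently contributes zero rows. One possible way to surface those symbols is sketched below. `combine_stock_frames` is a hypothetical helper, not the notebook's method, and it works on frames that have already been downloaded so it can be tried without network access.

```python
import pandas as pd


def combine_stock_frames(frames_by_stock):
    """Concatenate non-empty frames and report the symbols that had no rows.

    :param frames_by_stock: dict mapping a stock symbol to its DataFrame (or None).
    :return: (combined DataFrame, list of symbols with no data)
    """
    kept, failed = [], []
    for stock, frame in frames_by_stock.items():
        if frame is None or frame.empty:
            failed.append(stock)
        else:
            kept.append(frame)

    # pd.concat raises on an empty list, so fall back to an empty frame
    combined = pd.concat(kept, ignore_index=True) if kept else pd.DataFrame()
    return combined, failed


# Illustrative input: one symbol with a row, one with nothing
frames = {
    "AC.PA": pd.DataFrame({"Date": ["2021-05-17"], "Close": [31.790001], "Stock": ["AC.PA"]}),
    "MT.PA": pd.DataFrame(),
}
combined, failed = combine_stock_frames(frames)
print(len(combined))  # 1
print(failed)         # ['MT.PA']
```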

In [7]:
all_stock_data_df = aggregate_stocks(
    stocks=cac40_stocks,
    output_path='cac40_stock_data.csv',
    start_date=start_date,
    end_date=end_date
)
all_stock_data_df.head()

[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed

1 Failed download:
['MT.PA']: Exception('%ticker%: No price data found, symbol may be delisted (1d 2021-05-17 16:35:46.997772 -> 2024-05-16 16:35:46.997772)')
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[***********

| | Date | Open | High | Low | Close | Adj Close | Volume | Stock |
|---|---|---|---|---|---|---|---|---|
| 0 | 2021-05-17 | 32.419998 | 32.5 | 31.76 | 31.790001 | 31.460169 | 389760.0 | AC.PA |
| 1 | 2021-05-18 | 32.060001 | 32.209999 | 31.6 | 32.130001 | 31.796642 | 578073.0 | AC.PA |
| 2 | 2021-05-19 | 31.700001 | 31.790001 | 30.959999 | 31.52 | 31.19297 | 765741.0 | AC.PA |
| 3 | 2021-05-20 | 31.799999 | 31.82 | 31.07 | 31.33 | 31.004942 | 604361.0 | AC.PA |
| 4 | 2021-05-21 | 31.4 | 31.98 | 31.27 | 31.629999 | 31.301828 | 474191.0 | AC.PA |
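
Since the download log reports a failure for `MT.PA`, it can be worth checking which expected symbols actually ended up in the saved file. The sketch below is an illustration rather than part of the original notebook: it reads a two-row in-memory sample taken from the output above instead of `cac40_stock_data.csv`, and the `expected` list is a shortened stand-in for `cac40_stocks`.

```python
import io

import pandas as pd

# Two rows copied from the output above, standing in for the saved CSV file
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume,Stock\n"
    "2021-05-17,32.419998,32.5,31.76,31.790001,31.460169,389760.0,AC.PA\n"
    "2021-05-18,32.060001,32.209999,31.6,32.130001,31.796642,578073.0,AC.PA\n"
)

# Shortened stand-in for the full symbol list
expected = ["AC.PA", "MT.PA"]

df = pd.read_csv(sample_csv, parse_dates=["Date"])

# Number of rows per symbol; symbols with no data do not appear at all
rows_per_stock = df.groupby("Stock").size()
missing = [s for s in expected if s not in rows_per_stock.index]

print(missing)  # ['MT.PA']
```

With the real file, replacing `sample_csv` by the path `'cac40_stock_data.csv'` and `expected` by `cac40_stocks` would apply the same check to the full download.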


<hr>
<br>
<div align='justify'><font color="#353B47" size="4">I appreciate your time and effort in reading this notebook. My aim is to address your questions and curiosities in a comprehensive and clear manner. I welcome any constructive feedback that helps me improve and motivates me to deliver higher-quality content. My primary goal is to share knowledge and learn from others while fueling my passion for the subject. If you found this notebook valuable, please consider upvoting and sharing my work.</font></div>
<br>
<div align='center'><font color="#353B47" size="3">Thank you, and let passion be your guide.</font></div>