# **Gathering and Storing Holiday Data for Zimbabwe and South Africa**

In [1]:
!pip install --upgrade holidays




**The primary objective is to compile a comprehensive list of public holidays for Zimbabwe (ZW) and South Africa (ZA) for the years 2018 through 2022. Public holidays shift traffic volumes and business activity away from their usual weekday patterns, so having the holiday dates alongside traffic or business data lets us separate holiday effects from ordinary day-to-day variation.**

In [2]:
import csv
import os
import holidays

YEARS = [2018, 2019, 2020, 2021, 2022]

# Start with empty dictionaries so the later steps still run if loading fails
zw_holidays = {}
za_holidays = {}

# holidays.ZW / holidays.ZA return dict-like objects mapping date -> holiday name
try:
    zw_holidays = holidays.ZW(years=YEARS)
    za_holidays = holidays.ZA(years=YEARS)
except Exception as e:
    print(f"An error occurred: {e}")

# Combine both countries into one mapping of date -> holiday name.
# Note: on dates observed in both countries, the South African name
# overwrites the Zimbabwean one.
Combined_Holidays = {}
if isinstance(zw_holidays, dict) and isinstance(za_holidays, dict):
    Combined_Holidays = {day: name for day, name in zw_holidays.items()}
    Combined_Holidays.update({day: name for day, name in za_holidays.items()})

# Function to save holidays to a single CSV file
def save_Combined_Holidays_to_csv(country_holidays, path):
    # os.path.join avoids a doubled separator when path already ends with '/'
    csv_path = os.path.join(path, "Combined_Holidays.csv")
    with open(csv_path, 'w', newline='', encoding='utf-8') as csvfile:
        csvwriter = csv.writer(csvfile)
        csvwriter.writerow(['Date', 'Holiday'])  # Single 'Holiday' column
        for holiday_date, name in sorted(country_holidays.items()):
            csvwriter.writerow([holiday_date, name])  # Write Date and Holiday Name

# Save combined holidays
if Combined_Holidays:
    save_Combined_Holidays_to_csv(Combined_Holidays, 'C:/Users/shume/Desktop/6501.81_Capstone Project/Refrences Data Comparision/Additional Data/')
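
**Because the combined dictionary is keyed by date alone, a date observed in both countries (New Year's Day, for instance) keeps only one name. A minimal sketch of one alternative, which adds a `Country` column so that shared dates survive as separate rows, is shown below. The helper `merge_with_country` and the small hand-written sample are illustrative assumptions, not part of the original workflow; in practice the inputs would be the objects returned by `holidays.ZW(...)` and `holidays.ZA(...)`.**

```python
import csv
import io
from datetime import date


def merge_with_country(holidays_by_country):
    """Flatten {country_code: {date: name}} into sorted (date, country, name) rows."""
    rows = []
    for country, mapping in holidays_by_country.items():
        for day, name in mapping.items():
            rows.append((day, country, name))
    # Sort by date first, then by country code, so shared dates sit together
    return sorted(rows)


# Small hand-written sample (hypothetical stand-in for the holidays objects)
sample = {
    "ZW": {date(2018, 1, 1): "New Year's Day", date(2018, 4, 18): "Independence Day"},
    "ZA": {date(2018, 1, 1): "New Year's Day", date(2018, 4, 27): "Freedom Day"},
}

rows = merge_with_country(sample)

# Write to an in-memory buffer here; a real run would open a file instead
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Date", "Country", "Holiday"])
writer.writerows(rows)
print(buffer.getvalue())
```

**With this layout, a downstream join can filter on `Country` when only one side of the border is relevant.**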


# **Downloading and Processing OPEC Fuel Price Data**

In [3]:
!pip install requests




**The goal of this script is to download, parse, and process oil price data from the OPEC (Organization of the Petroleum Exporting Countries) website. The data is published in XML format, and the script converts it into a more accessible CSV format, keeping only the years 2018 to 2022.**

In [4]:
import requests
import xml.etree.ElementTree as ET
import pandas as pd

# Step 1: Download XML from URL
file_url = 'https://www.opec.org/basket/basketDayArchives.xml'
response = requests.get(file_url, timeout=30)  # timeout so a stalled request cannot hang the notebook

if response.status_code != 200:
    print(f"Failed to download the XML file (HTTP {response.status_code}). Please check the URL.")
else:
    print("XML file downloaded. Proceeding to next steps.")

    # Step 2: Parse XML
    try:
        root = ET.fromstring(response.content)
        print("XML is well-formed.")

        # Step 3: Extract the date ('data') and price ('val') attributes of each entry
        data_list = []
        for child in root:
            data_dict = {
                "Date": child.attrib.get('data', None),
                "Price": child.attrib.get('val', None)
            }
            data_list.append(data_dict)

        df = pd.DataFrame(data_list)
        df['Date'] = pd.to_datetime(df['Date'])
        # XML attributes are strings; store prices as numbers
        df['Price'] = pd.to_numeric(df['Price'], errors='coerce')

        # Step 4: Keep 2018-2022 only and save as CSV
        filtered_df = df[(df['Date'] >= '2018-01-01') & (df['Date'] <= '2022-12-31')]
        filtered_csv_file_path = 'C:\\Users\\shume\\Desktop\\6501.81_Capstone Project\\Refrences Data Comparision\\Additional Data\\OPEC_Fuel_price.csv'
        filtered_df.to_csv(filtered_csv_file_path, index=False)
        print(f"Filtered CSV saved at {filtered_csv_file_path}")

    except ET.ParseError:
        print("XML is not well-formed. Please correct the XML and try again.")


XML file downloaded. Proceeding to next steps.
XML is well-formed.
Filtered CSV saved at C:\Users\shume\Desktop\6501.81_Capstone Project\Refrences Data Comparision\Additional Data\OPEC_Fuel_price.csv
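
**The parsing and filtering logic can be checked offline against a small in-memory document before it is run on the live feed. The sketch below assumes the same shape the script reads: a root element whose children carry a `data` (date) attribute and a `val` (price) attribute. The element names and the three price values are made-up placeholders, and `parse_basket` is a hypothetical helper.**

```python
import xml.etree.ElementTree as ET
from datetime import date

# Placeholder document: element names and values are invented for illustration
SAMPLE_XML = """<Basket>
  <BasketList data="2017-12-29" val="10.00"/>
  <BasketList data="2018-01-02" val="20.00"/>
  <BasketList data="2023-01-03" val="30.00"/>
</Basket>"""


def parse_basket(xml_text, start, end):
    """Return (date, price) pairs whose date falls within [start, end]."""
    root = ET.fromstring(xml_text)
    records = []
    for child in root:
        raw_date = child.attrib.get("data")
        raw_price = child.attrib.get("val")
        if raw_date is None or raw_price is None:
            continue  # skip entries missing either attribute
        day = date.fromisoformat(raw_date)
        if start <= day <= end:
            records.append((day, float(raw_price)))
    return records


records = parse_basket(SAMPLE_XML, date(2018, 1, 1), date(2022, 12, 31))
print(records)
```

**Only the middle entry falls inside the 2018 to 2022 window, so it is the only one kept.**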


**The OPEC (Organization of the Petroleum Exporting Countries) Basket Price, also known as the OPEC Reference Basket, is a weighted average of oil prices collected from various OPEC member countries. It serves as a significant benchmark for oil prices worldwide and provides a reliable gauge of oil market conditions.**

**The OPEC Basket is composed of different crude oils from OPEC member countries, each with varying characteristics and prices. The oils included in the basket are selected based on their importance in the global oil market, and the basket is updated over time to reflect changes in production and exports.**
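
**As a rough illustration of what "weighted average" means here: if crude $i$ has price $p_i$ and weight $w_i$, a weighted average over $n$ crudes takes the general form below. The symbols are introduced only for this illustration; the actual weights and how OPEC derives them are not given in this notebook.**

$$
P_{\text{basket}} = \frac{\sum_{i=1}^{n} w_i \, p_i}{\sum_{i=1}^{n} w_i}
$$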

**The OPEC Basket Price is used for:**

- **Market Analysis: It serves as a reference point for understanding global oil prices.**
- **Policy Decisions: OPEC and other organizations use it to make informed decisions about oil production quotas.**
- **Economic Forecasting: It is a crucial variable for predicting economic conditions, as fluctuations in oil prices can significantly impact the global economy.**

# **Stock Data Retrieval**

In [5]:
!pip install yfinance




**Here, we integrate economic indicators into a traffic volume estimation model. Stock data from major companies serves as a proxy for economic activity and can improve traffic predictions. Generally, higher economic activity correlates with increased traffic volumes and vice versa. By automating the retrieval of this stock data, we can indirectly assess economic conditions, an important input for traffic volume modeling. This approach is particularly useful for traffic and urban development analysis.**

In [6]:
import yfinance as yf
import os  # for path operations

# List of South African stock symbols
stock_symbols = ["AMS.JO", "HAR.JO", "BAW.JO", "SHP.JO", "WHL.JO", "SBK.JO", "FSR.JO"]

# Time range. yfinance treats `end` as exclusive; 2022-12-31 fell on a
# Saturday, so no trading day of 2022 is lost.
start_date = "2018-01-01"
end_date = "2022-12-31"

# Directory to save CSV files
save_directory = "C:\\Users\\shume\\Desktop\\6501.81_Capstone Project\\Refrences Data Comparision\\Additional Data\\stock index"

# Create the directory if it does not exist yet
os.makedirs(save_directory, exist_ok=True)

# Fetch and save data for each stock symbol
for symbol in stock_symbols:
    # Fetch historical stock data
    data = yf.download(symbol, start=start_date, end=end_date)

    # Skip symbols for which nothing was returned instead of writing an empty file
    if data.empty:
        print(f"No data returned for {symbol}; skipping.")
        continue

    # Save data to CSV file in the specified directory
    csv_filename = os.path.join(save_directory, f"{symbol}_stock_data.csv")
    data.to_csv(csv_filename)

    print(f"Saved data for {symbol} to {csv_filename}")


[*********************100%%**********************]  1 of 1 completed
Saved data for AMS.JO to C:\Users\shume\Desktop\6501.81_Capstone Project\Refrences Data Comparision\Additional Data\stock index\AMS.JO_stock_data.csv
[*********************100%%**********************]  1 of 1 completed
Saved data for HAR.JO to C:\Users\shume\Desktop\6501.81_Capstone Project\Refrences Data Comparision\Additional Data\stock index\HAR.JO_stock_data.csv
[*********************100%%**********************]  1 of 1 completed
Saved data for BAW.JO to C:\Users\shume\Desktop\6501.81_Capstone Project\Refrences Data Comparision\Additional Data\stock index\BAW.JO_stock_data.csv
[*********************100%%**********************]  1 of 1 completed
Saved data for SHP.JO to C:\Users\shume\Desktop\6501.81_Capstone Project\Refrences Data Comparision\Additional Data\stock index\SHP.JO_stock_data.csv
[*********************100%%**********************]  1 of 1 completed
Saved data for WHL.JO to C:\Users\shume\Desktop\6501.81
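
**Stock exchanges are closed on weekends and public holidays, so the downloaded series has no rows for those days, whereas daily traffic counts usually cover every calendar day. One common way to align the two is to carry the last known close forward. The sketch below does this with the standard library only; `forward_fill_daily` is a hypothetical helper and the prices are placeholder values, not real quotes.**

```python
from datetime import date, timedelta


def forward_fill_daily(closes, start, end):
    """Return {date: close} for every day in [start, end], carrying the
    most recent close forward over days with no trading data."""
    filled = {}
    last_close = None
    day = start
    while day <= end:
        if day in closes:
            last_close = closes[day]
        if last_close is not None:  # days before the first quote stay absent
            filled[day] = last_close
        day += timedelta(days=1)
    return filled


# Placeholder closes for a Friday and the following Monday
closes = {date(2018, 1, 5): 100.0, date(2018, 1, 8): 102.0}

daily = forward_fill_daily(closes, date(2018, 1, 4), date(2018, 1, 9))
for day, close in daily.items():
    print(day, close)
```

**Forward filling never uses a future price, which matters if the series later feeds a predictive model.**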

**Incorporating stock market data into the traffic estimation model can offer a multi-dimensional perspective on factors affecting traffic flow. Here is how these stocks could be relevant:**

**Economic Indicators: Companies like Anglo American Platinum (AMS.JO) and Harmony Gold Mining (HAR.JO) are closely tied to commodity markets. Their stock prices can reflect broader economic conditions, which in turn could influence cross-border traffic for trade.**

**Consumer Behavior: Retail-oriented stocks like Shoprite Holdings (SHP.JO) and Woolworths Holdings (WHL.JO) could provide insights into consumer behavior. Increased stock prices might correlate with increased consumer activity, potentially affecting traffic patterns, especially during holidays or weekends.**

**Financial Climate: Banking stocks like Standard Bank Group (SBK.JO) and FirstRand Ltd (FSR.JO) might reflect the overall financial climate. For example, rising bank share prices could indicate a healthy economy, potentially driving up cross-border business and, consequently, traffic.**

**Industrial Activity: Barloworld (BAW.JO), an industrial brand management company, can give insights into industrial activity levels. Increased stock prices might indicate increased industrial activity, possibly affecting cargo traffic.**

**Correlation with Traffic Metrics: As we've seen in the correlation analysis, some of these stocks show moderate correlation with various traffic metrics, indicating that they could be relevant features for the model.**

**Risk Diversification: Using a variety of stocks from different sectors can help diversify the feature set, making the model more robust to changes in any single sector.**
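
**The correlation check mentioned above can be sketched with a plain Pearson coefficient. The function below is a minimal standard-library version, and both series are placeholder numbers chosen for illustration, not results from this project.**

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Placeholder series: daily closes and daily vehicle counts
stock_close = [100.0, 101.0, 103.0, 102.0, 105.0]
vehicle_count = [1200, 1250, 1300, 1280, 1400]

r = pearson(stock_close, vehicle_count)
print(f"Pearson r = {r:.3f}")
```

**A coefficient near zero would suggest that the stock adds little linear signal, although it says nothing about non-linear or lagged relationships.**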