### **Introduction**
Our project, *Best Stock Picker*, analyzes the performance of the 10 most traded stocks from 2019 to 2024 to identify trends, patterns, and correlations that can guide investment decisions. Stocks are a cornerstone of personal finance, yet many people find the stock market difficult to navigate. By visualizing key metrics from historical data, this project aims to make stock performance accessible and actionable for investors.

The project aims to:
1. Analyze stock price trends over time to identify growth patterns and periods of high volatility.
2. Compare key metrics, such as average trading volume, price fluctuations, and overall returns, across companies.
3. Create **static and interactive visualizations** to explore stock performance dynamically.
4. Develop user scenarios to assess potential investment strategies for short-term and long-term investors.
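One way to locate the "periods of high volatility" mentioned in the first aim is a rolling standard deviation of daily returns. The following is a minimal sketch, not necessarily the method used in the project: the function name `rolling_volatility`, the 21-day default window (roughly one trading month), and the price series are all assumptions made for illustration.

```python
import pandas as pd


def rolling_volatility(prices: pd.Series, window: int = 21) -> pd.Series:
    """Rolling standard deviation of daily percentage returns (hypothetical helper)."""
    daily_returns = prices.pct_change()
    return daily_returns.rolling(window).std()


# Synthetic prices for illustration only (not project data):
# a steady climb followed by a choppy stretch.
prices = pd.Series(
    [100, 101, 102, 103, 104, 105, 95, 110, 92, 115],
    index=pd.date_range("2020-01-01", periods=10, freq="B"),
    dtype="float64",
)

# A short window is used here only because the sample series is tiny.
vol = rolling_volatility(prices, window=3)
most_volatile_day = vol.idxmax()  # End date of the most volatile 3-day window
```

Plotting `vol` alongside the price line would show where volatility spikes; the right window length depends on whether short-term or long-term swings are of interest.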

For static visualizations, we used **Altair** and **Matplotlib** to generate line charts, bar charts, and heatmaps that highlight key trends and comparisons. For interactive visualizations, we used **D3.js** for dynamic line charts and **Plotly** for customizable scatter plots, so users can explore the data and tailor the view to their own needs. Together, these visualizations present complex financial data in a user-friendly format that helps both novice and experienced investors make informed decisions.

### **Data Description**
The dataset for this project comprises historical stock data for the 10 most traded stocks from 2019 to 2024. The data was sourced from Yahoo Finance, a trusted platform for financial information. Each stock file includes daily data with the following attributes:
- **Date**: The trading date.
- **Open**: The opening price of the stock on the given day.
- **High**: The highest price recorded during the trading day.
- **Low**: The lowest price recorded during the trading day.
- **Close**: The closing price of the stock at the end of the trading day.
- **Adj Close**: The adjusted closing price, accounting for corporate actions like dividends and splits.
- **Volume**: The number of shares traded during the day.

The dataset contains approximately 12,500 records, about 1,250 rows per stock, which corresponds to roughly five years of daily trading sessions. To support the analysis, we calculated key metrics such as mean trading volume and average price volatility for each company. We also preprocessed the dataset to remove anomalies and ensure consistency, enabling meaningful comparisons across companies.
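The per-company metrics described above could be computed along these lines. This is a sketch under stated assumptions: "volatility" is taken to mean the standard deviation of daily returns on `Adj Close`, which the report does not confirm, and `summarize_by_company`, the `AAA`/`BBB` tickers, and all values are hypothetical.

```python
import pandas as pd


def summarize_by_company(data: pd.DataFrame) -> pd.DataFrame:
    """Mean volume, return volatility, and total return per company (hypothetical helper)."""
    data = data.sort_values(["Company", "Date"])
    # Daily percentage change of the adjusted close, computed within each company
    daily_return = data.groupby("Company")["Adj Close"].pct_change()
    grouped = data.assign(daily_return=daily_return).groupby("Company")
    summary = grouped.agg(
        mean_volume=("Volume", "mean"),
        volatility=("daily_return", "std"),
        first_price=("Adj Close", "first"),
        last_price=("Adj Close", "last"),
    )
    # Total return over the whole period covered by the data
    summary["total_return"] = summary["last_price"] / summary["first_price"] - 1
    return summary


# Synthetic rows for two made-up tickers, for illustration only.
sample = pd.DataFrame(
    {
        "Company": ["AAA"] * 3 + ["BBB"] * 3,
        "Date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"] * 2),
        "Adj Close": [10.0, 11.0, 12.1, 20.0, 10.0, 20.0],
        "Volume": [100, 200, 300, 50, 50, 50],
    }
)
summary = summarize_by_company(sample)
```

Using `Adj Close` rather than `Close` keeps dividends and splits from showing up as artificial price jumps in the return series.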

### **Summary of Findings**
Our analysis of the 10 most traded stocks from 2019 to 2024 yielded several insights into stock performance and investment strategies:
1. **Stock Price Trends**:
   - The line chart visualization revealed that certain stocks, such as [Example Stock], demonstrated steady growth, making them suitable for long-term investors.
   - Conversely, stocks like [Example Stock] exhibited high volatility, appealing to short-term traders seeking quick gains.

2. **Key Metrics Comparisons**:
   - The scatter plot analysis highlighted a strong correlation between high trading volume and price stability for certain stocks, indicating consistent market interest.
   - Outliers, such as [Example Stock], showed unusually high volatility despite lower trading volumes, suggesting speculative activity.

3. **Investment Scenarios**:
   - Stocks with stable trends and consistent trading volumes, such as [Example Stock], may be ideal for risk-averse investors focusing on long-term growth.
   - High-volatility stocks with fluctuating prices, such as [Example Stock], offer opportunities for risk-tolerant investors seeking short-term gains.
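The volume-versus-volatility relationship and the outliers described in the findings can be checked numerically. The sketch below assumes a per-company summary table with `mean_volume` and `volatility` columns; the column names, the `S1` to `S5` labels, the thresholds (median volume, 75th-percentile volatility), and every value are assumptions for illustration, not project results.

```python
import pandas as pd

# Synthetic per-company summary, for illustration only (not project results).
summary = pd.DataFrame(
    {
        "mean_volume": [90e6, 60e6, 40e6, 20e6, 10e6],
        "volatility": [0.010, 0.015, 0.020, 0.030, 0.060],
    },
    index=["S1", "S2", "S3", "S4", "S5"],
)

# Rank correlation (Spearman, computed as Pearson on ranks): a value near -1
# means higher-volume stocks tend to be less volatile.
rho = summary["mean_volume"].rank().corr(summary["volatility"].rank())


def flag_low_volume_high_volatility(table: pd.DataFrame) -> list:
    """Stocks below median volume and above the 75th-percentile volatility."""
    low_volume = table["mean_volume"] < table["mean_volume"].median()
    high_volatility = table["volatility"] > table["volatility"].quantile(0.75)
    return table.index[low_volume & high_volatility].tolist()


flagged = flag_low_volume_high_volatility(summary)
```

With only 10 stocks, a correlation like this is a rough indicator at best, so flagged names are candidates for a closer look rather than conclusions.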

The interactive visualizations let users explore these findings by metric and time period. Future work could incorporate external factors, such as market events or macroeconomic indicators, to deepen the analysis.

```python
import glob
from pathlib import Path

import pandas as pd

# Load every per-company CSV and tag its rows with the company name
files = glob.glob("project_csv/*.csv")
dataframes = []
for file in files:
    # Extract the company name from the file name. Path(...).name uses the
    # platform's path separator, unlike splitting the path on '/'.
    company_name = Path(file).name.split('.')[0]
    df = pd.read_csv(file)
    df['Company'] = company_name  # Add a column for the company name
    dataframes.append(df)

# Combine all data into one DataFrame
merged_data = pd.concat(dataframes, ignore_index=True)

# Convert Date column to datetime
merged_data['Date'] = pd.to_datetime(merged_data['Date'])

# Aggregate data for scatter plot, keeping Date for filtering.
# If each company has one row per day, each mean equals that day's raw value.
aggregated_data = merged_data.groupby(['Company', 'Date']).agg(
    high_mean=('High', 'mean'),
    low_mean=('Low', 'mean'),
    adj_close_mean=('Adj Close', 'mean'),
    volume_mean=('Volume', 'mean')
).reset_index()

# Save the updated dataset with Date for filtering
aggregated_data.to_csv('scatter_plot_data.csv', index=False)
```
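Because `Date` is kept in the aggregated data, a time-range selection can be applied before plotting. The following is a small sketch of that filtering step in pandas; `filter_date_range` is a hypothetical helper, and the sample rows (including the `AAA` ticker) are synthetic.

```python
import pandas as pd


def filter_date_range(data: pd.DataFrame, start: str, end: str) -> pd.DataFrame:
    """Keep rows whose Date falls within [start, end], both ends inclusive."""
    mask = data["Date"].between(pd.Timestamp(start), pd.Timestamp(end))
    return data.loc[mask].reset_index(drop=True)


# Synthetic rows shaped like the aggregated output, for illustration only.
sample = pd.DataFrame(
    {
        "Company": ["AAA"] * 4,
        "Date": pd.to_datetime(
            ["2019-12-31", "2020-01-02", "2020-06-30", "2021-01-04"]
        ),
        "adj_close_mean": [10.0, 10.5, 12.0, 15.0],
    }
)
in_2020 = filter_date_range(sample, "2020-01-01", "2020-12-31")
```

The same inclusive-range logic can be mirrored on the front end when a user picks a time range in the interactive charts.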


My section of the work:

Content Writing:
- Write the Introduction (topic and tasks).
- Write the Data Description (size, source, and attributes).
- Draft the Summary of Findings section.

Interactive Visualizations:
- Create two interactive visualizations:
    - Line chart with user-selectable time ranges using D3.js.
    - Scatter plot with selectable axes using Plotly.

Webpage Integration:
- Embed interactive visualizations into the webpage.
- Ensure all content and visualizations are integrated smoothly.
- Review and refine the final webpage for consistency and accuracy.
