My section of the work:

Content Writing:
- Write the Introduction (topic and tasks).
- Write the Data Description (size, source, and attributes).
- Draft the Summary of Findings section.

Interactive Visualizations:
- Create two interactive visualizations:
    - Line chart with user-selectable time ranges using D3.js.
    - Scatter plot with selectable axes using Plotly.

Webpage Integration:
- Embed interactive visualizations into the webpage.
- Ensure all content and visualizations are integrated smoothly.
- Review and refine the final webpage for consistency and accuracy.
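
As a rough sketch of how the scatter plot with selectable axes could be wired up in Plotly: one dropdown per axis, where each button swaps the plotted column and the axis title. The helper names `axis_buttons` and `build_scatter` are hypothetical, and the sample values are copied from the scatter plot data printed later in this notebook. The layout positions are placeholders, not the final design.

```python
def axis_buttons(data, columns, axis):
    """Build one dropdown button per column for the given axis ('x' or 'y')."""
    return [
        dict(
            label=col,
            method="update",
            # First dict restyles the trace data (one list per trace),
            # second dict relayouts the matching axis title.
            args=[{axis: [data[col]]}, {f"{axis}axis.title.text": col}],
        )
        for col in columns
    ]


def build_scatter(data, columns, label_key="Company"):
    """Assemble a scatter figure with one dropdown for x and one for y."""
    # Imported here so axis_buttons() stays usable without Plotly installed.
    import plotly.graph_objects as go

    x0, y0 = columns[0], columns[1]
    fig = go.Figure(
        go.Scatter(
            x=data[x0],
            y=data[y0],
            mode="markers+text",
            text=data[label_key],
            textposition="top center",
        )
    )
    fig.update_layout(
        xaxis_title=x0,
        yaxis_title=y0,
        updatemenus=[
            dict(buttons=axis_buttons(data, columns, "x"), x=0.0, y=1.15, xanchor="left"),
            dict(buttons=axis_buttons(data, columns, "y"), x=0.35, y=1.15, xanchor="left"),
        ],
    )
    return fig


# Illustrative sample: three rows taken from the aggregated output in this notebook.
sample = {
    "Company": ["AAPL", "ADDYY", "AMD"],
    "high_mean": [129.990979, 127.761262, 85.775064],
    "low_mean": [127.185805, 125.719907, 82.339802],
    "volume_mean": [9.895392e07, 7.990883e04, 6.627018e07],
}
columns = ["high_mean", "low_mean", "volume_mean"]
x_buttons = axis_buttons(sample, columns, "x")
```

Calling `build_scatter(sample, columns).write_html("scatter.html")` should produce a standalone HTML file that can be embedded in the webpage; the file name here is only an example.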


In [8]:
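import glob
from pathlib import Path

import pandas as pd

# Define the path to your CSV files
path = "project_csv/*.csv"

# Load all CSVs into a single DataFrame
files = glob.glob(path)
if not files:
    # pd.concat fails on an empty list, so stop early with a clearer message
    raise FileNotFoundError(f"No CSV files matched {path!r}")

dataframes = []
for file in files:
    # Use the file stem (e.g. "AAPL" from "AAPL.csv") as the company name;
    # Path handles both "/" and "\" separators
    company_name = Path(file).stem
    df = pd.read_csv(file)
    df['Company'] = company_name  # Add a column for the company name
    dataframes.append(df)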

# Combine all data into one DataFrame
merged_data = pd.concat(dataframes, ignore_index=True)

# Process data for the line chart (keep Date, Adj Close, and Company)
line_chart_data = merged_data[['Date', 'Adj Close', 'Company']].copy()
line_chart_data['Date'] = pd.to_datetime(line_chart_data['Date'])  # Convert Date to datetime


# Save line chart data for later use
line_chart_data.to_csv('line_chart_data.csv', index=False)

# Process data for the scatter plot (aggregate by Company)
scatter_plot_data = merged_data.groupby('Company').agg(
    high_mean=('High', 'mean'),
    low_mean=('Low', 'mean'),
    adj_close_mean=('Adj Close', 'mean'),
    volume_mean=('Volume', 'mean')
).reset_index()

# Save scatter plot data for later use
scatter_plot_data.to_csv('scatter_plot_data.csv', index=False)

# Display results for verification
print("Line Chart Data Sample:")
print(line_chart_data.head())

print("\nScatter Plot Data Sample:")
print(scatter_plot_data.head())


Line Chart Data Sample:
        Date   Adj Close Company
0 2019-03-14  114.750084   ADDYY
1 2019-03-15  114.807297   ADDYY
2 2019-03-18  112.595238   ADDYY
3 2019-03-19  113.558250   ADDYY
4 2019-03-20  114.340088   ADDYY

Scatter Plot Data Sample:
  Company   high_mean    low_mean  adj_close_mean   volume_mean
0    AAPL  129.990979  127.185805      127.104573  9.895392e+07
1   ADDYY  127.761262  125.719907      123.766634  7.990883e+04
2     AMD   85.775064   82.339802       84.101785  6.627018e+07
3    GOOG  103.284349  101.059955      102.195139  2.885142e+07
4    INTC   47.339016   46.140111       43.418522  3.339969e+07
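
The user-selectable time range for the line chart comes down to keeping only the rows between two dates. The final chart does this in the browser with D3.js, so the pandas version below is only a way to check the logic on the Python side. The function name `filter_by_range` is hypothetical, and the sample rows are the five ADDYY rows printed above.

```python
import pandas as pd


def filter_by_range(df, start, end, date_col="Date"):
    """Return rows whose date falls within [start, end], both ends inclusive."""
    dates = pd.to_datetime(df[date_col])
    mask = dates.between(pd.Timestamp(start), pd.Timestamp(end))
    return df.loc[mask].reset_index(drop=True)


# Illustrative sample: the five ADDYY rows from the line chart output above.
sample = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2019-03-14", "2019-03-15", "2019-03-18", "2019-03-19", "2019-03-20"]
    ),
    "Adj Close": [114.750084, 114.807297, 112.595238, 113.558250, 114.340088],
    "Company": ["ADDYY"] * 5,
})

# Keep only the rows between 15 and 19 March 2019.
selected = filter_by_range(sample, "2019-03-15", "2019-03-19")
print(selected)
```

The same function should work on the full `line_chart_data` frame, since it relies only on the `Date` column.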
