# Dataset of Datasets

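This notebook catalogs the datasets available for analysis. Each entry gives a short description, a link to the source file, and two questions worth exploring. The code cells at the end load every dataset, save a local CSV copy, and summarize its size.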


## Apple Stock (AAPL)

This dataset contains historical stock price data for Apple Inc. (AAPL). It supports analysis of how the company's share price has behaved over time and how that behavior relates to broader market trends.

[Apple Stock Data](https://rhodyexchange.org/dataset/AAPL.csv)

1. How has Apple's stock price performed over different time periods, and what factors might explain significant price movements?
2. What patterns exist in Apple's trading volume and how do they correlate with major product launches or earnings announcements?
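
As a starting point for the first question, the sketch below computes the return within each calendar year. The column names `Date` and `Close` are assumptions, since the layout of `AAPL.csv` is not described here, and the sample frame is synthetic stand-in data rather than real prices.

```python
import pandas as pd


def yearly_returns(prices: pd.DataFrame) -> pd.Series:
    """Fractional change from the first to the last close within each calendar year.

    Assumes hypothetical columns 'Date' (datetime) and 'Close' (float).
    """
    prices = prices.sort_values("Date")
    by_year = prices.groupby(prices["Date"].dt.year)["Close"]
    return by_year.last() / by_year.first() - 1.0


# Synthetic stand-in data, not real AAPL prices.
sample = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-02", "2020-12-31", "2021-01-04", "2021-12-31"]),
    "Close": [100.0, 150.0, 150.0, 120.0],
})
returns = yearly_returns(sample)
print(returns)
```

Grouping on the year of the date column rather than resampling keeps the sketch independent of frequency-alias changes between pandas versions.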


## Google Stock (GOOG)

This dataset contains historical stock price data for Alphabet Inc. (Google's parent company, ticker GOOG). It supports analysis of the company's market performance and of trends in the technology sector more broadly.

[Google Stock Data](https://rhodyexchange.org/dataset/GOOG.json)

1. How does Google's stock performance compare to other major tech companies, and what drives its market valuation?
2. What correlation exists between Google's stock price movements and key business developments like acquisitions or regulatory changes?
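
Comparing companies whose share prices sit at very different levels is easier once each series is rebased to a common starting value. The sketch below shows one way to do that. It assumes each input is a series of closing prices in date order, and both series are made-up values used only for illustration.

```python
import pandas as pd


def rebase_to_100(close: pd.Series) -> pd.Series:
    """Scale a price series so that its first observation equals 100."""
    return close / close.iloc[0] * 100.0


# Made-up closing prices; 'OTHER' is a placeholder for any comparison stock.
goog = pd.Series([50.0, 55.0, 60.0])
other = pd.Series([200.0, 210.0, 190.0])

comparison = pd.DataFrame({
    "GOOG": rebase_to_100(goog),
    "OTHER": rebase_to_100(other),
})
print(comparison)
```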


## Tesla Stock (TSLA)

This dataset contains historical stock price data for Tesla Inc. (TSLA). It supports analysis of the stock's volatility and of how Tesla's performance relates to the wider automotive and clean energy sectors.

[Tesla Stock Data](https://rhodyexchange.org/dataset/TSLA.csv)

1. How does Tesla's stock volatility compare to traditional automotive companies, and what factors contribute to its high price swings?
2. What relationship exists between Tesla's stock performance and key company milestones like vehicle delivery numbers or energy product launches?
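
One common way to quantify the price swings mentioned in the first question is the rolling standard deviation of daily returns. The sketch below is a minimal version of that idea. The window length is an arbitrary choice, and both input series are synthetic rather than taken from `TSLA.csv`.

```python
import pandas as pd


def rolling_volatility(close: pd.Series, window: int = 21) -> pd.Series:
    """Rolling standard deviation of day-over-day percentage returns."""
    daily_returns = close.pct_change()
    return daily_returns.rolling(window).std()


# Synthetic closing prices: one series that swings, one that drifts steadily.
volatile = pd.Series([100.0, 120.0, 90.0, 130.0])
steady = pd.Series([100.0, 101.0, 102.0, 103.0])

vol_volatile = rolling_volatility(volatile, window=3)
vol_steady = rolling_volatility(steady, window=3)
print(vol_volatile.iloc[-1] > vol_steady.iloc[-1])
```

The first `window` values of the result are missing, because the first return is undefined and a full window of returns is needed before a standard deviation can be computed.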


In [None]:
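import pandas as pd
import numpy as np

# `datasets` is a local module (not shown here). The loop below expects it to
# expose a list of dicts with 'short_name', 'url', and 'load_function' keys.
from datasets import datasets

metadata_list = []

for dataset in datasets:
    # Load the file with the loader registered for this dataset.
    df = dataset['load_function'](dataset['url'])

    # Save a local CSV copy, whatever the source format was.
    df.to_csv(f"{dataset['short_name']}.csv", index=False)

    # Record basic shape information for the summary table.
    num_rows = len(df)
    num_columns = len(df.columns)
    num_numerical = len(df.select_dtypes(include=[np.number]).columns)

    metadata_list.append({
        'name': dataset['short_name'],
        'source': dataset['url'],
        'num_rows': num_rows,
        'num_columns': num_columns,
        'num_numerical': num_numerical
    })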

In [None]:
metadata_df = pd.DataFrame(metadata_list)

print(metadata_df)

print(f"Total datasets: {len(metadata_df)}")
print(f"Total rows across all datasets: {metadata_df['num_rows'].sum()}")
print(f"Average columns per dataset: {metadata_df['num_columns'].mean():.1f}")
print(f"Average numerical columns per dataset: {metadata_df['num_numerical'].mean():.1f}")
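
The loading loop above imports `datasets` from a local module that this notebook does not show. The sketch below is one guess at its shape, inferred only from the keys the loop reads (`short_name`, `url`, `load_function`). Pairing the `.json` file with `pd.read_json` is an assumption, since the layout of that file is not documented here.

```python
import io

import pandas as pd

# Hypothetical contents of datasets.py. The URLs come from the links in this
# notebook; the choice of loader for each file is an assumption.
datasets = [
    {
        "short_name": "AAPL",
        "url": "https://rhodyexchange.org/dataset/AAPL.csv",
        "load_function": pd.read_csv,
    },
    {
        "short_name": "GOOG",
        "url": "https://rhodyexchange.org/dataset/GOOG.json",
        "load_function": pd.read_json,
    },
    {
        "short_name": "TSLA",
        "url": "https://rhodyexchange.org/dataset/TSLA.csv",
        "load_function": pd.read_csv,
    },
]

# A loader can be exercised offline by passing an in-memory buffer
# instead of a URL. The two columns here are made up for the demo.
demo = datasets[0]["load_function"](io.StringIO("Date,Close\n2020-01-02,100.0\n"))
print(demo.shape)
```

Storing the loader next to each URL lets the loop treat CSV and JSON sources the same way, at the cost of requiring every loader to accept a single URL argument.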
