# **How can we analyze, predict, and compare economic performance across U.S. states?**

Economic disparities across U.S. states have significant implications for policymakers, businesses, and researchers. By exploring metrics such as GDP, personal income, and consumer spending, we can uncover historical trends, predict future economic performance, and compare states to identify disparities and areas for improvement.


### **Project Overview**
This project provides an interactive Flask-based dashboard to explore, analyze, and predict economic metrics for U.S. states using a real dataset. It offers:
1. **Exploratory Data Analysis (EDA):** Visualize economic trends over time.
2. **Predictive Analysis:** Predict future economic performance using linear regression.
3. **Correlation Analysis:** Highlight relationships between economic metrics using heatmaps.
4. **Comparative Analysis:** Compare economic metrics between two states for a specific year.
5. **Filter and Explore Data:** View raw data for transparency.


1. [Project Goals](#project-goals)
2. [Code Explanation](#code-explanation)
3. [How Features Fulfill Requirements](#how-features-fulfill-requirements)
4. [Installation and Usage](#installation-and-usage)

### **Project Goals**
- Explore economic trends for U.S. states using visualizations.
- Predict future economic performance with machine learning.
- Highlight relationships between economic metrics to uncover insights.
- Allow users to filter and compare data for better understanding and transparency.

---

### **Code Explanation**


### **Data Loading and Cleaning**


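In [None]:
import pandas as pd

# Load the raw table, skipping the first three lines and reading "(NA)" as missing.
file_path = "Table.csv"
data = pd.read_csv(file_path, skiprows=3, na_values=["(NA)"])
# Convert the year columns (position 4 onward) to numbers; bad values become NaN.
year_columns = data.columns[4:]
data[year_columns] = data[year_columns].apply(pd.to_numeric, errors="coerce")
# Drop columns and rows that are entirely empty.
data.dropna(axis=1, how="all", inplace=True)
data.dropna(axis=0, how="all", inplace=True)
# Drop section-header rows, which are labels rather than metrics.
headers_to_remove = ["Real dollar statistics", "Current dollar statistics (millions of dollars)", ...]
data = data[~data["Description"].isin(headers_to_remove)]
# Save the cleaned table and reload it for use by the dashboard.
data.to_csv("cleaned_data.csv", index=False)
cleaned_data = pd.read_csv("cleaned_data.csv")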


Purpose: Clean and preprocess the dataset to remove irrelevant rows/columns and handle missing values.

Relevance: Ensures the data is ready for analysis and visualization.
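
The cleaning steps above can be tried on a small in-memory sample. This is a minimal sketch, not the project's actual data: the sample rows, their values, the `LineCode` column, and the `clean_table` helper name are all made up for illustration. It assumes the layout the cell above implies (three preamble lines, four identifying columns, then one column per year).

```python
import io

import pandas as pd

# Hypothetical sample mimicking the assumed layout: three preamble lines,
# a header row, then data. All values are invented for illustration.
SAMPLE_CSV = """Preamble line 1
Preamble line 2
Preamble line 3
GeoFips,GeoName,LineCode,Description,2019,2020
00000,United States,,Real dollar statistics,,
00000,United States,1,Real GDP,100.0,(NA)
06000,California,1,Real GDP,15.5,16.0
"""


def clean_table(source, section_headers):
    """Load a wide table (one column per year) and drop unusable rows and columns."""
    # Read "(NA)" as missing while loading.
    df = pd.read_csv(source, skiprows=3, na_values=["(NA)"])
    year_cols = df.columns[4:]
    # Coerce any leftover non-numeric text in the year columns to NaN.
    df[year_cols] = df[year_cols].apply(pd.to_numeric, errors="coerce")
    # Drop columns and rows that are entirely empty.
    df = df.dropna(axis=1, how="all").dropna(axis=0, how="all")
    # Remove section-header rows, which are labels rather than metrics.
    return df[~df["Description"].isin(section_headers)].reset_index(drop=True)


cleaned = clean_table(io.StringIO(SAMPLE_CSV), ["Real dollar statistics"])
print(cleaned)
```

Passing the header labels as an argument keeps the helper reusable if the source table changes its section names.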

### **Dataset Filtering**


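In [None]:
# Assumes `app = Flask(__name__)` and `cleaned_data` are defined earlier.
from flask import render_template, request

@app.route("/filter")
def index():
    state = request.args.get("state")
    year = request.args.get("year")
    filtered_data = cleaned_data
    if state:
        # Case-insensitive match on the state name.
        filtered_data = filtered_data[filtered_data["GeoName"].str.contains(state, case=False, na=False)]
    # Assumes year columns are named by year (e.g. "2020"), so the query value is the column name.
    if year and year in filtered_data.columns:
        filtered_data = filtered_data[["GeoFips", "GeoName", "Description", year]]
    subset_data = filtered_data.drop(columns=["GeoFips"], errors="ignore")
    return render_template("base.html", columns=subset_data.columns, data=subset_data.values.tolist())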


Purpose: Allows users to filter data by state and year.

Relevance: Provides raw data exploration functionality for transparency.
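
The filtering logic can also be separated from Flask so it is easy to test on its own. The sketch below is one possible way to do that, not the project's actual code: `filter_table` and the sample table are hypothetical, and the numbers are invented. It narrows to a single year only when that year exists as a column.

```python
import pandas as pd

# Hypothetical cleaned table; values are invented for illustration.
table = pd.DataFrame({
    "GeoFips": ["06000", "36000", "48000"],
    "GeoName": ["California", "New York", "Texas"],
    "Description": ["Real GDP", "Real GDP", "Real GDP"],
    "2019": [1.0, 2.0, 3.0],
    "2020": [1.5, 2.5, 3.5],
})


def filter_table(df, state=None, year=None):
    """Return rows matching `state`; keep only `year` if it is a column."""
    out = df
    if state:
        # Case-insensitive substring match; regex=False treats the input literally.
        mask = out["GeoName"].str.contains(state, case=False, na=False, regex=False)
        out = out[mask]
    if year and year in out.columns:
        out = out[["GeoFips", "GeoName", "Description", year]]
    # GeoFips is an identifier rather than something to display.
    return out.drop(columns=["GeoFips"], errors="ignore")


result = filter_table(table, state="york", year="2020")
print(result)
```

A route handler could then call such a helper with the values read from `request.args` and pass the result to the template.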

### **Exploratory Data Analysis (EDA)**


In [None]:
@app.route("/eda", methods=["GET", "POST"])
def eda():
    selected_state = request.form.get("state", "United States")
    selected_metric = request.form.get("metric", "Real GDP")
    filtered_data = data[(data["GeoName"] == selected_state) & ...]
    ...
    plt.plot(values.index, values.values, marker="o")
    ...
    return render_template("eda.html", plot_url=img)


Purpose: Visualize trends over time for a selected state and metric.

Relevance: Highlights historical trends, outliers, and growth periods.
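
The route above passes an image to the template as `plot_url`, but the step that produces it is elided. A common pattern is to render the figure into memory and base64-encode it. The sketch below assumes that approach. The `series_to_png_base64` helper and the sample values are hypothetical and may differ from what the project does.

```python
import base64
import io

import matplotlib

matplotlib.use("Agg")  # headless backend: render without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical yearly values for one state and metric (invented numbers).
values = pd.Series([100.0, 104.0, 101.0, 108.0], index=[2017, 2018, 2019, 2020])


def series_to_png_base64(series, title):
    """Plot a yearly series and return the PNG image as a base64 string."""
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(series.index, series.values, marker="o")
    ax.set_title(title)
    ax.set_xlabel("Year")
    ax.set_ylabel("Value")
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png", bbox_inches="tight")
    plt.close(fig)  # release the figure so repeated requests do not accumulate
    return base64.b64encode(buffer.getvalue()).decode("ascii")


img = series_to_png_base64(values, "Real GDP (sample data)")
```

A template could then embed the result with a data URI such as `<img src="data:image/png;base64,{{ plot_url }}">`.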

### **Predictive Analysis**


In [None]:
@app.route("/predict", methods=["GET", "POST"])
def predict():
    filtered_data = data[(data["GeoName"] == selected_state) & ...]
    ...
    model = LinearRegression()
    model.fit(X, y)
    predictions = model.predict(future_years)
    ...
    return render_template("predict.html", plot_url=img, table=prediction_table.to_dict("records"))


Purpose: Fits a linear regression to the historical values of a selected state and metric, then extrapolates the trend to future years.

Relevance: Provides foresight into economic performance, helping with planning and policy-making.
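
To make the fitting step concrete, here is a minimal sketch of a straight-line trend fit and extrapolation. It uses `numpy.polyfit` in place of scikit-learn's `LinearRegression` to keep dependencies small. With a single feature (the year), both compute the same least-squares line. The history values are invented, and the variable names are illustrative only.

```python
import numpy as np

# Hypothetical history for one state and metric (invented numbers).
years = np.array([2016, 2017, 2018, 2019, 2020], dtype=float)
values = np.array([10.0, 12.0, 14.0, 16.0, 18.0])

# Degree-1 least-squares fit: value is approximately slope * year + intercept.
slope, intercept = np.polyfit(years, values, deg=1)

# Extrapolate the fitted line to years beyond the observed range.
future_years = np.array([2021, 2022, 2023], dtype=float)
predictions = slope * future_years + intercept

for year, value in zip(future_years, predictions):
    print(int(year), round(float(value), 2))
```

A straight-line extrapolation assumes the past trend continues unchanged, so projections far beyond the observed years should be read with caution.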

### **Correlation Heatmap**


**Note:** In hindsight, I've realized that my correlation heatmap doesn't work quite as I designed it. It still renders a heatmap, and the code demonstrates my understanding of the topic within the project, so I've included it here in case there's interest.

In [None]:
@app.route("/heatmap", methods=["GET", "POST"])
def heatmap():
    filtered_data = data[data["GeoName"] == selected_state]
    ...
    sns.heatmap(corr, annot=True, cmap="coolwarm", fmt=".2f")
    ...
    return render_template("heatmap.html", plot_url=img)


Purpose: Displays correlations between economic metrics.

Relevance: Highlights relationships and dependencies between metrics.
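
Because the table stores one row per metric and one column per year, correlating metrics with each other requires years as rows and metrics as columns. The sketch below shows one way to reshape the data for that, assuming this wide layout. The sample rows and values are invented, and this is not necessarily how the route above builds `corr`.

```python
import pandas as pd

# Hypothetical rows for one state: one row per metric, one column per year.
state_rows = pd.DataFrame({
    "Description": ["Real GDP", "Personal income", "Consumer spending"],
    "2017": [100.0, 50.0, 40.0],
    "2018": [104.0, 52.0, 41.0],
    "2019": [101.0, 53.0, 43.0],
    "2020": [108.0, 56.0, 44.0],
})

# Transpose so that years become rows and metrics become columns.
# DataFrame.corr() computes pairwise correlations between columns.
by_year = state_rows.set_index("Description").T
corr = by_year.corr()
print(corr.round(2))
```

The resulting square matrix, indexed by metric name on both axes, is the kind of input `sns.heatmap` expects.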

### **Comparative Analysis**


In [None]:
@app.route("/compare", methods=["GET", "POST"])
def compare():
    data1 = data[(data["GeoName"] == state1) & ...]
    ...
    plt.bar(data1["Description"], data1[year], label=state1)
    ...
    return render_template("compare.html", plot_url=img)


Purpose: Compares metrics for two states in a selected year.

Relevance: Identifies disparities and economic strengths/weaknesses based on the region.
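
The same comparison can be produced as a table, which is useful for checking the numbers behind the bar chart. This is a minimal sketch under assumed inputs: `compare_states` is a hypothetical helper and the sample values are invented.

```python
import pandas as pd

# Hypothetical cleaned table (invented numbers).
table = pd.DataFrame({
    "GeoName": ["California", "California", "Texas", "Texas"],
    "Description": ["Real GDP", "Personal income", "Real GDP", "Personal income"],
    "2020": [30.0, 25.0, 18.0, 15.0],
})


def compare_states(df, state1, state2, year):
    """Return one row per metric with each state's value in `year` and their difference."""
    subset = df[df["GeoName"].isin([state1, state2])]
    # Reshape to one row per metric and one column per state.
    wide = subset.pivot(index="Description", columns="GeoName", values=year)
    wide = wide[[state1, state2]]  # keep the requested column order
    return wide.assign(Difference=wide[state1] - wide[state2])


comparison = compare_states(table, "California", "Texas", "2020")
print(comparison)
```

Metrics measured in different units are not directly comparable on one axis, so a table like this can be easier to read than a single bar chart.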

#### **Quick Summary**

**Filter Dataset:** Allows for raw data exploration and transparency.

**EDA:** Visualizes trends to understand historical economic performance.

**Predictive Analysis:** Uses machine learning to show future trends.

**Correlation Heatmap:** Highlights relationships between variables, aiding in feature selection and a better understanding of the U.S. economy.

**Comparative Analysis:** Compares performance between regions, helping to identify economic disparities and areas that can be improved.

______

### **Conclusion** 

This project analyzes state-level U.S. economic data and offers insight into the relationships between key metrics such as gross domestic product (GDP), employment, and consumer spending over time. Through visualizations and predictive modeling, the dashboard uncovers several important patterns and relationships:

**Strong Correlations Between Gross Domestic Product and Employment**

States with higher GDP levels have stronger employment metrics. Economic growth is tied to job creation, which highlights the importance of GDP as an indicator.

**Consumer Spending as a Key Economic Indicator**

Correlations between consumer spending and GDP demonstrate that increased spending often indicates higher income levels. This relationship is apparent in states with diversified economies.

**Economic Disparities Between States**

Comparative analysis highlights significant disparities in economic performance across states. California and New York are extreme examples: they consistently outperform smaller or less industrialized states in metrics such as per capita income and GDP.







To sum up, this project highlights the value of data-driven insights in understanding regional economic performance. By combining historical trends, predictive modeling, and comparative analysis, the dashboard becomes a useful tool for uncovering economic patterns that may inform future work. Relationships such as those between GDP, employment, and consumer spending can be applied to decisions in any business. In the future, the project could be enhanced with additional datasets, such as demographic factors and housing or product-level trends, to provide further analytical insight. In the end, this project shows how data science helps make sense of complex economic landscapes and supports decision-making.