In [0]:
SELECT * FROM ets_thriventcohort.default.county_forecast_combined


In [0]:
SELECT *
FROM ets_thriventcohort.default.county_forecast_combined
ORDER BY AGI_change_raw DESC
LIMIT 10;

In [0]:
SELECT *
FROM ets_thriventcohort.default.county_forecast_combined
WHERE LOWER(location) LIKE '%minnesota%'
ORDER BY pop_2027_predicted DESC


In [0]:
WITH growth AS (
  SELECT
    Location,
    AGI_2027_predicted,
    pop_2027_predicted,
    AGI_per_capita_2027,
    pop_change_raw AS Pop_Growth,
    pop_change_pct AS Pop_Growth_Perc,
    AGI_change_raw AS AGI_Growth,
    AGI_change_pct AS AGI_Growth_Perc
  FROM ets_thriventcohort.default.county_forecast_combined
  WHERE pop_change_raw > 0
    AND AGI_change_raw > 0
    AND pop_2027_predicted > 250000
    AND AGI_2027_predicted > 20000000
    AND AGI_per_capita_2027 > 50
    AND Location NOT LIKE '%Minnesota%'
),

ranked AS (
  SELECT *,
    RANK() OVER (ORDER BY Pop_Growth_Perc DESC) AS Pop_Rank,
    RANK() OVER (ORDER BY AGI_Growth_Perc DESC) AS AGI_Rank
  FROM growth
)

SELECT *,
  (Pop_Rank + AGI_Rank) AS Composite_Score
FROM ranked
ORDER BY Composite_Score ASC
LIMIT 10;


- ### Summary of Analysis

- The primary focus of this analysis was to identify the best counties for bank branch expansion in 2027 based on two key indicators: population size and income level. Because COVID-19 had a large effect on both population and the economy in the United States, I used only data from 2020 onward, since I believe predictions based on that period are more accurate. Current AGI (Adjusted Gross Income) data extends only to 2022 and population data only to 2024, so each 2027 forecast was fit on every available year for its metric. The change over time, however, is measured from 2022 to 2027 for both population and AGI. Specifically:
  - Linear regression was used to predict AGI.
  - Polynomial regression was used to estimate population.
- Both absolute and percentage growth between 2022 and 2027 were calculated to assess the trajectory of each county. 
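
A minimal sketch of the forecasting and growth steps described above, assuming an ordinary least-squares line fit on the available years and a percentage change expressed relative to the 2022 value. The function names and the sample series are hypothetical and are not taken from the dataset. Only the linear (AGI) case is shown; the population forecast used a polynomial fit instead.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def forecast_linear(years, values, target_year):
    """Extrapolate the fitted line to target_year."""
    slope, intercept = fit_line(years, values)
    return slope * target_year + intercept


def growth(base_value, predicted_value):
    """Return (absolute change, percent change) from base to prediction."""
    change_raw = predicted_value - base_value
    change_pct = change_raw / base_value * 100  # assumption: stored as a percent
    return change_raw, change_pct


# Hypothetical AGI series for one county (made-up numbers).
years = [2020, 2021, 2022]
agi = [100.0, 110.0, 120.0]

agi_2027 = forecast_linear(years, agi, 2027)
# The change is measured from the last observed AGI year (2022) to 2027.
agi_change_raw, agi_change_pct = growth(agi[-1], agi_2027)
```

With only three observations per county, a straight line extrapolated five years ahead is sensitive to any single unusual year, which is one reason to treat the resulting growth figures as rough estimates.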

- ### Minnesota as a Baseline Filter
- Since the objective is to find out-of-state locations, Minnesota counties were excluded from the final ranking table. However, a separate table was created for select MN counties (Ramsey, Hennepin, Anoka, Washington, and Dakota) using the predicted growth for 2027. These counties served as a performance benchmark: any location underperforming compared to these regions was filtered out.
- To align with Minnesota’s economic profile, the following thresholds were applied:
  - Population must exceed 250,000
  - AGI must exceed $20,000,000
  - AGI per capita must exceed $50 
- ### Trends and Rankings
- During initial exploration, counties with larger baseline values tended to show greater absolute growth in both metrics. Meanwhile, counties with lower starting points often showed inflated percentage changes due to smaller denominators.
- To address this, the model emphasized percentage-based growth, with Minnesota’s benchmarks acting as a guardrail to eliminate outliers with misleading gains.
- Each remaining county was ranked by:
  - Percent change in population
  - Percent change in AGI
- A composite score was calculated by summing both ranks. The lower the composite score, the more favorable the county; a score of 2 represents top performance in both categories.
- While the objective was to identify the top three locations, all but two of the counties in the top ten were in Florida or Texas, which suggests a positive trend for counties in these states.
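
The composite ranking described above can be sketched in a few lines, assuming the same tie behavior as SQL `RANK()` (tied values share a rank and the next rank is skipped). The county names and percentages below are hypothetical placeholders, not results from the analysis.

```python
def rank_desc(values):
    """Mimic RANK() OVER (ORDER BY value DESC): rank = 1 + count of larger values."""
    return [1 + sum(other > v for other in values) for v in values]


# Hypothetical percent-growth figures (made-up numbers).
counties = [
    {"name": "County A", "pop_pct": 12.0, "agi_pct": 30.0},
    {"name": "County B", "pop_pct": 15.0, "agi_pct": 25.0},
    {"name": "County C", "pop_pct": 9.0, "agi_pct": 40.0},
    {"name": "County D", "pop_pct": 15.0, "agi_pct": 45.0},
]

pop_ranks = rank_desc([c["pop_pct"] for c in counties])
agi_ranks = rank_desc([c["agi_pct"] for c in counties])

# Composite score is the sum of the two ranks; lower is better.
for county, pop_rank, agi_rank in zip(counties, pop_ranks, agi_ranks):
    county["composite"] = pop_rank + agi_rank

ranked = sorted(counties, key=lambda c: c["composite"])
```

Because ties share a rank, two counties can both hold rank 1 on one metric, so a composite score of 2 is not necessarily unique.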