# Assignment Background
The United States, home to approximately 320 million citizens, is a large country made up of 50 states and the nation’s capital, the District of Columbia (also called Washington D.C.). In this project, you will explore some of the economic statistics of each state and create a visualization of the data. No data on U.S. territories are included in the file.

Data will be read from a CSV (comma-separated values) file called State_Data.txt. This file has no header record, so you may read it as a plain text file and split each line to separate the individual values.

Each row in the file contains the following 2010 information for one state in the United States. Values are separated by commas:
-  1st Value: State
-  2nd Value: Region (defined by the Bureau of Economic Analysis)
-  3rd Value: Population (in millions)
   -  The total number of people living in the state.
-  4th Value: GDP (in billions)
   -  Measure of the state’s economic activity; a higher GDP means a higher monetary value of the goods and services produced within the state’s borders.
-  5th Value: Personal Income (in billions)
   -  All incomes received by individuals and households.
-  6th Value: Subsidies (in millions)
   -  Money granted by the state’s government to help an industry or business.
-  7th Value: Compensation of Employees (in billions)
   -  Pre-tax wages paid by employers to employees.
-  8th Value: Taxes on Production and Imports (in billions)
   -  Taxes chargeable to the business expenses of producing and importing.

Note that Python uses zero-based indexing, meaning that when data is put in a list the first value (State) will be found by taking the 0th index of the list.
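
As a small illustration of splitting a record and indexing into it, the sketch below parses one made-up line in the eight-column layout described above. The sample values and the helper name `parse_record` are invented for illustration and are not taken from the data file.

```python
# Hypothetical record in the eight-column layout; the numbers are made up.
sample_line = "Example_State,Example_Region,1.5,60.0,55.0,120.0,35.0,4.0\n"


def parse_record(line):
    """Split one comma-separated record into a list of stripped strings."""
    return [value.strip() for value in line.split(",")]


fields = parse_record(sample_line)

state = fields[0]              # 1st value -> index 0
region = fields[1]             # 2nd value -> index 1
population = float(fields[2])  # 3rd value -> index 2, in millions

print(state, region, population)
```

Stripping each value removes the trailing newline that would otherwise stay attached to the last column.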

Also note that some values are in millions while others are in billions, so when you combine two values in a calculation (which you will), convert them to the same unit first.
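
For example, dividing personal income (in billions) by population (in millions) gives dollars per person only after both are expressed in the same unit. The sketch below uses made-up figures, and `income_per_person` is a hypothetical helper name rather than one the assignment requires.

```python
MILLION = 1_000_000
BILLION = 1_000_000_000


def income_per_person(income_billions, population_millions):
    """Return dollars per person from income in billions and population in millions."""
    total_dollars = income_billions * BILLION
    total_people = population_millions * MILLION
    return total_dollars / total_people


# Made-up figures: 400 billion dollars of income shared by 10 million people.
print(income_per_person(400, 10))  # 40000.0
```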

The Bureau of Economic Analysis defines the following regions. This information is already in the provided data file.
-	Far_West: Alaska, California, Hawaii, Nevada, Oregon, Washington
-	Great_Lakes: Illinois, Indiana, Michigan, Ohio, Wisconsin
-	Mideast: Delaware, District of Columbia (Washington D.C.), Maryland, New_Jersey, New_York, Pennsylvania
-	New_England: Connecticut, Maine, Massachusetts, New_Hampshire, Rhode_Island, Vermont
-	Plains: Iowa, Kansas, Minnesota, Missouri, Nebraska, North_Dakota, South_Dakota
-	Rocky_Mountain: Colorado, Idaho, Montana, Utah, Wyoming
-	Southeast: Alabama, Arkansas, Florida, Georgia, Kentucky, Louisiana, Mississippi, North_Carolina, South_Carolina, Tennessee, Virginia, West_Virginia
-	Southwest: Arizona, New_Mexico, Oklahoma, Texas

This data is provided by the U.S. Bureau of Economic Analysis.


# Assignment Description

You are to provide economic statistics by region as follows:
-  Total population for the region (sum of the third column)
-  Total GDP for the region (sum of the fourth column)
-  Average population in the region (total population / number of states in the region)
-  Average personal income (total of the fifth column / total population)

Your code must provide the following:
-  Prompt the user for a region name.
-  Read the data from the file selecting only the data for the region.
-  Build three dictionaries for the region for the population, GDP, and personal income.  The key for each dictionary is the State.
-  In the case where the user entered an invalid region, an appropriate error message should be displayed.  Note, you will not know this until you've read through the file and found no data.
-  Write a function called calc_total_pop that is passed the population dictionary and returns the total population.  You will need to iterate through the dictionary to produce the total population.
-  Write a function called calc_total_gdp that is passed the gdp dictionary and returns the total GDP.
-  Write a function called calc_total_pi that is passed the personal income dictionary and returns the total personal income.
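
One way to meet these requirements is sketched below. It assumes the column layout described earlier, reads from an in-memory sample instead of the real file, and uses made-up values. `build_region_dicts` is a hypothetical helper name, while `calc_total_pop` is the function name the assignment requires.

```python
import io

# Made-up records in the eight-column layout (not real data).
SAMPLE = io.StringIO(
    "State_A,Region_X,2.0,100.0,80.0,50.0,60.0,5.0\n"
    "State_B,Region_X,3.0,150.0,120.0,70.0,90.0,8.0\n"
    "State_C,Region_Y,1.0,40.0,30.0,20.0,25.0,2.0\n"
)


def build_region_dicts(lines, region):
    """Return (population, gdp, income) dictionaries keyed by state for one region."""
    population, gdp, income = {}, {}, {}
    for line in lines:
        fields = line.strip().split(",")
        if fields[1] == region:
            state = fields[0]
            population[state] = float(fields[2])
            gdp[state] = float(fields[3])
            income[state] = float(fields[4])
    return population, gdp, income


def calc_total_pop(population):
    """Iterate through the dictionary and add up every state's population."""
    total = 0.0
    for state in population:
        total += population[state]
    return total


population, gdp, income = build_region_dicts(SAMPLE, "Region_X")
if not population:
    # No matching rows were found, so the region name must be invalid.
    print("No data found for that region.")
else:
    print(calc_total_pop(population))  # 5.0
```

An empty population dictionary after the whole file has been read is the signal that the region name was invalid, which matches the note in the requirements.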

The output of your program must be as follows:
-  A list (not a Python list) of states in the region.
-  The total population for the region.
-  The total GDP for the region.
-  The average population of the region.
-  The average personal income in the region.

Include any appropriate captions and labels to describe the above data. Below is a suggested output format.  Points will be deducted for an *untidy* report.

Economic statistics for the Great_Lakes region:  
&nbsp;&nbsp;&nbsp;&nbsp;States in Region:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Illinois, Indiana, Michigan, Ohio, Wisconsin  
&nbsp;&nbsp;&nbsp;&nbsp;Total population:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;46.436  million  
&nbsp;&nbsp;&nbsp;&nbsp;Average population:&nbsp;&nbsp;9.2872  million  
&nbsp;&nbsp;&nbsp;&nbsp;Total GDP:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1776.04  billion  
&nbsp;&nbsp;&nbsp;&nbsp;Average PI:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;37669.85  

You may want to investigate the *round* function to reduce the number of decimals in floating point numbers.  The *join* function will prove handy for showing all states on one line.
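
A minimal sketch of both functions, using the state names and total population from the sample report above:

```python
states = ["Illinois", "Indiana", "Michigan", "Ohio", "Wisconsin"]
average_population = 46.436 / len(states)

# join() glues the items together with the separator string it is called on.
state_line = ", ".join(states)

# round() limits the number of decimal places that are shown.
print(f"States in Region:   {state_line}")
print(f"Average population: {round(average_population, 4)} million")
```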

For easier debugging and clarity, use multiple cells to modularize your program.  One model would be three cells where:
-  The first cell reads the file and builds the data structures.
-  The second cell performs the calculations.
-  The last cell displays the report.

In [10]:
def findRegion(input_region):
    """Validate the requested region and print its economic report."""
    regionSearch = str(input_region)
    info_states = []

    # Read every record; each line becomes a list of column strings.
    with open('Economic_Data_2010.txt', 'r') as file:
        for line in file:
            col = line.strip().split(",")
            info_states.append(col)

    # Collect the distinct region names (lowercased) found in the file.
    regions = []
    for state in info_states:
        region = state[1].lower()
        if region not in regions:
            regions.append(region)

    # An invalid region can only be detected after the whole file has been read.
    if regionSearch not in regions:
        print(f'Region "{regionSearch.title()}" was not found in the data file; please try another region.')
        # Return instead of calling exit(), which would stop the notebook kernel.
        return

    pop_region, gdp_region, income_region = gatherData(regionSearch, info_states)
    output = formatOutput(regionSearch, pop_region, gdp_region, income_region)

    for report_line in output:
        print(report_line)

                
def gatherData(searchRegion, info_states):
    """Build the population, GDP, and income dictionaries for one region, keyed by state."""
    pop_region = {}
    gdp_region = {}
    income_region = {}

    for row in info_states:
        if row[1].lower() == searchRegion:
            state = row[0].lower()
            # Keep the first record seen for each state; values stay as strings here.
            if state not in pop_region:
                pop_region[state] = row[2]

            if state not in gdp_region:
                gdp_region[state] = row[3]

            if state not in income_region:
                income_region[state] = row[4]

    return pop_region, gdp_region, income_region

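def calc_total_pop(population):
    """Return the region's total population (in millions), rounded to two decimals."""
    pop_sum = 0

    # The dictionary values are still strings from the file, so convert each one.
    for value in population.values():
        pop_sum += float(value)

    return round(pop_sum, 2)

def calc_total_gdp(gdp):
    """Return the region's total GDP (in billions), rounded to two decimals."""
    sum_gdp = 0

    for value in gdp.values():
        sum_gdp += float(value)

    return round(sum_gdp, 2)

def calc_total_pi(income):
    """Return the region's total personal income (in billions), rounded to two decimals."""
    sum_income = 0

    for value in income.values():
        sum_income += float(value)

    return round(sum_income, 2)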

def formatOutput(region, pop, gdp, income):
    """Compute the regional statistics and return the report lines as a tuple."""
    pop_sum = calc_total_pop(pop)
    avg_pop = round(pop_sum / len(pop), 2)
    sum_gdp = calc_total_gdp(gdp)
    sum_income = calc_total_pi(income)

    # Income is in billions and population in millions, so convert both to base
    # units before dividing. Both totals were already rounded to two decimals by
    # the calc_* functions, so this average can differ slightly from one
    # computed on unrounded totals.
    avg_income = round((sum_income * 1000000000) / (pop_sum * 1000000), 2)

    # Dictionary keys are lowercased state names; title-case them for display.
    region_states_string = ', '.join(key.title() for key in pop)

    stats = f"Economic statistics for the {region.title()} region:"
    regionStates = f"\tStates in Region:\t{region_states_string}"
    regionPop = f"\tTotal Population:\t{pop_sum} million"
    regionAvgPop = f"\tAverage Population:\t{avg_pop} million"
    regionGDP = f"\tTotal GDP:\t\t${sum_gdp} billion"
    regionAvgIncome = f"\tAverage Income:\t\t${avg_income}"

    return stats, regionStates, regionPop, regionAvgPop, regionGDP, regionAvgIncome

regionRequest = input("Enter the region in question: ").lower()

findRegion(regionRequest)

Enter the region in question: Great_Lakes
Economic statistics for the Great_Lakes region:
	States in Region:	Illinois, Indiana, Michigan, Ohio, Wisconsin
	Total Population:	46.44 million
	Average Population:	9.29 million
	Total GDP:		$1776.04 billion
	Average Income:		$37666.67
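
The average income printed here (37666.67) differs slightly from the 37669.85 in the assignment's sample report. A likely cause, assuming the underlying totals match the sample, is that `calc_total_pop` and `calc_total_pi` round their totals to two decimals before the average is computed. The sketch below reproduces the effect with the sample's population total (46.436) and an income total of 1749.237 billion, which is an assumption inferred from the sample figures rather than a value read from the data file.

```python
pop_total = 46.436       # millions, from the sample report
income_total = 1749.237  # billions, ASSUMED: inferred from 37669.85 * 46.436 / 1000

# Round only once, at the very end.
late = round(income_total * 1e9 / (pop_total * 1e6), 2)

# Round the totals first, then divide.
early = round(round(income_total, 2) * 1e9 / (round(pop_total, 2) * 1e6), 2)

print(late, early)
```

Keeping full precision in the totals and rounding only when the report is formatted avoids this kind of drift.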
