## Inspiration Index

- Data Sources:
  - `public_art`
  - `museums`
  - `libraries`
  - `parks_and_facilities`
  - `community_nonprofit_orgs`
  - `faith-based_facilities`

In [2]:
import pandas as pd

# public art
public_art = pd.read_csv('../../data_cleaned/assets/public_art.csv')
public_art_count = public_art['zip_code'].value_counts()
print("public_art:")
print("  mean:", public_art_count.mean())
print("  median:", public_art_count.median())
print("  max:", public_art_count.max())

# museums
museums = pd.read_csv('../../data_cleaned/assets/museums.csv')
museums_count = museums['zip_code'].value_counts()
print("museums:")
print("  mean:", museums_count.mean())
print("  median:", museums_count.median())
print("  max:", museums_count.max())

# libraries
libraries = pd.read_csv('../../data_cleaned/assets/libraries.csv')
libraries_count = libraries['zip_code'].value_counts()
print("libraries:")
print("  mean:", libraries_count.mean())
print("  median:", libraries_count.median())
print("  max:", libraries_count.max())

# parks
parks = pd.read_csv('../../data_cleaned/assets/parks_and_facilities.csv')
parks_count = parks['zip_code'].value_counts()
print("parks:")
print("  mean:", parks_count.mean())
print("  median:", parks_count.median())
print("  max:", parks_count.max())

# nonprofit organizations
nonprofit_orgs = pd.read_csv('../../data_cleaned/assets/community_nonprofit_orgs.csv')
nonprofit_orgs_count = nonprofit_orgs['zip_code'].value_counts()
print("nonprofit_orgs:")
print("  mean:", nonprofit_orgs_count.mean())
print("  median:", nonprofit_orgs_count.median())
print("  max:", nonprofit_orgs_count.max())

# faith-based facilities
faith_based = pd.read_csv('../../data_cleaned/assets/faith-based_facilities.csv')
faith_based_count = faith_based['zip_code'].value_counts()
print("faith_based:")
print("  mean:", faith_based_count.mean())
print("  median:", faith_based_count.median())
print("  max:", faith_based_count.max())

public_art:
  mean: nan
  median: nan
  max: nan
museums:
  mean: nan
  median: nan
  max: nan
libraries:
  mean: 1.1363636363636365
  median: 1.0
  max: 2
parks:
  mean: nan
  median: nan
  max: nan
nonprofit_orgs:
  mean: 50.09340659340659
  median: 6.0
  max: 1717
faith_based:
  mean: 9.76595744680851
  median: 8.0
  max: 36
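
The `nan` values reported for `public_art`, `museums`, and `parks` are what pandas returns when `value_counts()` yields an empty Series, for example when every `zip_code` value in the file is missing. The following is a minimal sketch of a helper that makes this visible. The function name `summarize_zip_counts` and the toy data are hypothetical and not part of the original notebook.

```python
import pandas as pd

def summarize_zip_counts(df, column="zip_code"):
    """Return per-zip-code count statistics and how many rows have a usable zip code."""
    counts = df[column].value_counts()  # missing values are dropped by default
    return {
        "rows": len(df),
        "rows_with_zip": int(df[column].notna().sum()),
        "mean": counts.mean(),
        "median": counts.median(),
        "max": counts.max(),
    }

# Hypothetical toy data: one frame with zip codes, one where all are missing
with_zips = pd.DataFrame({"zip_code": [10001, 10001, 10002]})
without_zips = pd.DataFrame({"zip_code": [float("nan"), float("nan")]})

print(summarize_zip_counts(with_zips))
print(summarize_zip_counts(without_zips))
```

If `rows_with_zip` is 0 for a dataset, its statistics will be `nan`, which would explain why only three of the six datasets feed into the score below.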


#### Inspiration Index Calculation

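- Only libraries, nonprofit organizations, and faith-based facilities are used, because the other three datasets produced no per-zip-code counts (`nan` above).
- Based on the statistics above, the per-zip-code counts of libraries, nonprofit organizations, and faith-based facilities are taken to be on a scale of roughly 1:64:8. Each count is therefore weighted by the inverse of that scale (64, 1, and 8) so that no single category dominates.
- The score is calculated as follows:
  - `Tastescape Score = (Libraries * 64 + min(Nonprofit Organizations, 200) + Faith-based Facilities * 8)`
  - The nonprofit count is capped at 200 in the code, which limits the effect of outliers such as the maximum of 1717.
  - The raw score is then min-max normalized using the minimum and maximum score in the dataset. The code below normalizes to 0-1; multiply by 100 for a 0-100 scale.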
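
As a worked example of the calculation, the sketch below applies the weights and the cap of 200 from the scoring cell to three zip codes whose counts are copied from the notebook output. The column name `raw_score` and the constant `NONPROFIT_CAP` are illustrative names, not from the original. The normalization here runs over these three rows only, so the normalized values will differ from the full-dataset results.

```python
import pandas as pd

# Counts copied from the notebook output for three zip codes
sample = pd.DataFrame({
    "zip_code": [15212, 15219, 15213],
    "libraries_count": [2, 1, 2],
    "nonprofit_orgs_count": [1717, 418, 178],
    "faith_based_count": [36, 35, 28],
})

NONPROFIT_CAP = 200  # cap applied to the nonprofit count in the scoring cell

# Weighted sum: libraries * 64 + capped nonprofits * 1 + faith-based * 8
sample["raw_score"] = (
    sample["libraries_count"] * 64
    + sample["nonprofit_orgs_count"].clip(upper=NONPROFIT_CAP)
    + sample["faith_based_count"] * 8
)

# Min-max normalization over this sample only (not the full dataset)
score_min = sample["raw_score"].min()
score_max = sample["raw_score"].max()
sample["score"] = (sample["raw_score"] - score_min) / (score_max - score_min)

print(sample[["zip_code", "raw_score", "score"]])
```

For 15212 the raw score is 2 * 64 + 200 + 36 * 8 = 616, which matches the raw score the notebook reports for that zip code.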

In [5]:
# create a new dataframe with the counts for each zip code that has at least one library
zip_code_counts = pd.DataFrame({
    'zip_code': libraries_count.index,
    'libraries_count': libraries_count.values,
    'nonprofit_orgs_count': nonprofit_orgs_count.reindex(libraries_count.index, fill_value=0).values,
    'faith_based_count': faith_based_count.reindex(libraries_count.index, fill_value=0).values,
})
# print head 10 of the new dataframe
print(zip_code_counts.head(10))

   zip_code  libraries_count  nonprofit_orgs_count  faith_based_count
0     15210                2                    68                 21
1     15212                2                  1717                 36
2     15213                2                   178                 28
3     15101                1                   101                 11
4     15217                1                   196                 24
5     15204                1                    25                  7
6     15203                1                   108                 10
7     15106                1                   106                 17
8     15211                1                    38                  7
9     15241                1                   111                 11
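
Because the frame above is indexed by `libraries_count.index`, zip codes that have nonprofits or faith-based facilities but no library are left out. If those should be scored as well, one possible alternative is to align the three count Series on the union of their zip codes. This is a sketch under that assumption, with hypothetical toy counts and a hypothetical name `all_counts`. It is not the approach the notebook takes.

```python
import pandas as pd

# Hypothetical per-zip counts standing in for the notebook's three count Series
libraries_count = pd.Series({10001: 2, 10002: 1})
nonprofit_orgs_count = pd.Series({10002: 40, 10003: 12})
faith_based_count = pd.Series({10001: 5, 10003: 9})

# Concatenating along columns aligns on the union of the indexes;
# zip codes missing from a Series become NaN, then 0
all_counts = (
    pd.concat(
        {
            "libraries_count": libraries_count,
            "nonprofit_orgs_count": nonprofit_orgs_count,
            "faith_based_count": faith_based_count,
        },
        axis=1,
    )
    .fillna(0)
    .astype(int)
    .sort_index()
    .rename_axis("zip_code")
    .reset_index()
)

print(all_counts)
```

With this variant a zip code without a library gets `libraries_count` of 0 rather than being dropped, which would change the minimum score and therefore the normalized values.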


In [21]:
# calculate the raw score for each zip code (nonprofit count capped at 200)
zip_code_counts['score'] = (zip_code_counts['libraries_count'] * 64 +
                            zip_code_counts['nonprofit_orgs_count'].clip(upper=200) * 1 +
                            zip_code_counts['faith_based_count'] * 8)

# sort the dataframe by raw score in descending order
zip_code_counts = zip_code_counts.sort_values(by='score', ascending=False)
# print the top 10 rows with their raw scores
print(zip_code_counts.head(10))

# min-max normalize the score to be between 0 and 1
score_min = zip_code_counts['score'].min()
score_max = zip_code_counts['score'].max()
zip_code_counts['score'] = (zip_code_counts['score'] - score_min) / (score_max - score_min)

# save the zip codes and normalized scores to a csv file
tastescape_scores = zip_code_counts[['zip_code', 'score']]
print(tastescape_scores.head(10))
tastescape_scores.to_csv('../../data_score/inspiration_index.csv', index=False)

    zip_code  libraries_count  nonprofit_orgs_count  faith_based_count  score
1      15212                2                  1717                 36    616
13     15219                1                   418                 35    544
2      15213                2                   178                 28    530
17     15206                1                   220                 29    496
4      15217                1                   196                 24    452
16     15222                1                   501                 13    368
0      15210                2                    68                 21    364
11     15208                1                    78                 25    342
21     15220                1                   122                 17    322
7      15106                1                   106                 17    306
    zip_code     score
1      15212  1.000000
13     15219  0.847134
2      15213  0.817410
17     15206  0.745223
4      15217  0.651805
16  