## Introduction
A neighborhood can be defined as 'best' in countless ways. Here, we examine where a plant could grow best, given the soil type of the area. This metric comes from the group's question: "Which Pittsburgh-area neighborhood would be best for the growth and life of a plant?" Plant life depends largely on the ground it grows in and on what that ground is made of, which is why the Allegheny County soil type dataset was chosen. Plant life also depends on the water and rainfall it receives and on the air and pollutants that surround it, but soil type is the factor I have chosen to focus on.
## The Metric 
I will be using the [Allegheny County Soil Type Areas](https://data.wprdc.org/dataset/allegheny-county-soil-type-areas1) dataset from http://www.wprdc.org </br>
This dataset contains two main variables that determine whether plant life is viable in an area: Soil Code and Class. Soil Code is given a multiplier of 3 and Class is given a multiplier of 5, reflecting the relative importance of each to plant growth.
</br>
#### Soil Code
Soil Code is a 2-3 letter code. Codes beginning with S or U designate strip mines or urban development, respectively, so any area with such a code is poor for plant life and is not considered. Most of the remaining soil types are about as good as one another, so we examine a sub-metric of Soil Code: steepness. In 3-letter codes, the last letter is A, B, C, D, E, or F, which indicates the steepness. A is nearly level at 0-2% slope, B is 2-8% slope, C is 8-15% slope, and so on. Steepness beyond 15% would again be poor for plant life, so we only consider areas with a steepness code of A, B, or C, where A gives the highest score and C gives the lowest. The score of each area is then multiplied by 3, because steepness, while important to plant growth, is less of a factor overall than our next sub-metric: Class.
![](codes.png)
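
As a rough illustration of the steepness rule above, the sketch below scores a single soil code. The function name `steepness_score` and the sample codes are hypothetical and not taken from the dataset. The multiplier is passed in explicitly; 3 is used here only because it matches `steepnessMultiplier` in the notebook code.

```python
def steepness_score(soil_code, multiplier):
    """Return the weighted steepness score, or None if the area is not scored."""
    # Strip mine (S) and urban (U) areas are excluded outright.
    if soil_code[0] in ("S", "U"):
        return None
    # Only 3-letter codes carry a slope letter in the last position.
    if len(soil_code) < 3:
        return None
    # A is nearly level (best), C is the steepest slope still considered.
    points = {"A": 3, "B": 2, "C": 1}.get(soil_code[2])
    if points is None:
        return None
    return multiplier * points


# Hypothetical codes, for illustration only.
for code in ["GlA", "GlC", "GlE", "URB"]:
    print(code, steepness_score(code, multiplier=3))
```

Returning `None` rather than 0 keeps excluded areas distinct from areas that simply scored low.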
</br>
#### Class
Soils are placed into classes, from class 1 to class 4. Class 1 is typically high-yield, well-fertilized soil. At the other end of the spectrum, class 4 is poor soil with a low ability to support plant growth. So, we examine the soil class of each area and add to its score. We reverse the class (5 minus the class) and multiply it by the class multiplier (for example, a soil class of 4 gives us 1 * multiplier, while a soil class of 1 gives us 4 * multiplier). [Source](https://www.nrcs.usda.gov/Internet/FSE_MANUSCRIPTS/pennsylvania/PA003/0/allegheny.pdf)
![](classes.png)
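
The class rule can be sketched in the same way. The function name `class_score` is hypothetical, and treating any value other than a single digit from 1 to 4 as unscored is an assumption of this sketch, not something the dataset documentation states.

```python
def class_score(soil_class, multiplier):
    """Return the weighted class score, or None if the class is not 1-4."""
    text = str(soil_class)
    # Assumption: only single-digit classes 1-4 are scored.
    if text not in ("1", "2", "3", "4"):
        return None
    # Reverse the class so that class 1 (best soil) earns the most points.
    return multiplier * (5 - int(text))


for soil_class in ["1", "2", "3", "4"]:
    print(soil_class, class_score(soil_class, multiplier=5))
```

With a multiplier of 5, class 1 contributes 20 points and class 4 contributes 5.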

In [5]:
import pandas as pd
soils = pd.read_csv('soils.csv', sep=',')
#This dictionary maps each soil FID (a tag identifying a soil area sample) to its score
soilRatings = {}
soilFIDS = soils['FID']
#These multipliers weight how important each factor is for plant growth
steepnessMultiplier = 3
classMultiplier = 5

#This measures steepness. Only slope letters A, B, and C are scored: A (level ground, good for plants) gets 3 * multiplier, B gets 2 * multiplier, and C gets 1 * multiplier. Steeper slopes (D-F) get no score
soilCodes = soils['SOIL_CODE']
for i in range(len(soilCodes)):
    #Skip strip mine (S) and urban (U) areas, and codes too short to carry a slope letter
    if soilCodes[i][0] not in ('S', 'U') and len(soilCodes[i]) > 2:
        #65, 66, 67 are the character codes for 'A', 'B', 'C'
        for j in range(65, 68):
            if chr(j) == soilCodes[i][2]:
                if soilFIDS[i] in soilRatings:
                    soilRatings[soilFIDS[i]] += steepnessMultiplier * (68 - j)
                else:
                    soilRatings[soilFIDS[i]] = steepnessMultiplier * (68 - j)
#This measures class. Class 1 (good for plant growth) gets 4 * multiplier, while class 4 (poor growing conditions) gets 1 * multiplier
soilNums = soils['CLASS']
for i in range(len(soilCodes)):
    #Only single-digit class values are scored
    classValue = str(soilNums[i])
    if len(classValue) == 1 and classValue.isdigit():
        j = int(classValue)
        #Only areas that already received a steepness score are rated
        if soilFIDS[i] in soilRatings:
            soilRatings[soilFIDS[i]] += classMultiplier * (5 - j)

#Creates a new dictionary of only the soil sample areas with a score higher than the threshold, grouped by the leading digits of their FID
threshold = 22
topPicks = {}
for i in soilRatings:
    if soilRatings[i] > threshold:
        #Looks the FID up by position, which assumes FIDs are numbered from 1 in row order
        actual = str(soilFIDS[i - 1])
        #The group key is the first digit for short FIDs, otherwise the first two digits
        if len(actual) < 3:
            fL = actual[0]
        else:
            fL = actual[:2]
        if fL in topPicks:
            topPicks[fL] = topPicks[fL] + ', ' + actual
        else:
            topPicks[fL] = actual
print(topPicks)

{'11': '112, 1111, 11077, 11159, 11205, 11460, 11632, 11715, 11718', '30': '3063', '35': '3546', '40': '4093', '45': '4545', '57': '5709', '68': '6873', '74': '7439', '82': '8287', '83': '8329', '84': '8444, 8470', '87': '8728', '88': '8858', '92': '9239, 9250, 9253', '93': '9322', '96': '9653', '10': '10644, 10857, 10867, 10874', '12': '12076, 12218, 12220, 12560, 12837, 12868', '13': '13624, 13699, 13869', '15': '15284, 15534, 15545, 15615, 15753, 15831, 15934', '16': '16456', '19': '19324, 19568, 19884', '20': '20430, 20573, 20849, 20918, 20978', '21': '21007, 21270', '22': '22100, 22157, 22426, 22792', '23': '23343, 23532, 23647, 23703, 23994', '24': '24015, 24094, 24324, 24547, 24882', '25': '25027, 25052, 25117, 25141, 25167, 25321, 25348, 25391, 25434, 25498, 25523, 25530, 25862', '26': '26199, 26212, 26519, 26845'}
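
For comparison, the same scoring can be sketched with vectorized pandas operations instead of explicit loops. This is only a sketch under stated assumptions: the sample rows are made up for illustration, the column names follow the ones used above, and the multipliers and threshold are copied from the notebook code. It follows the rule as described in the prose (S and U codes excluded, only slopes A-C scored).

```python
import pandas as pd

# Hypothetical sample rows; the notebook reads these columns from soils.csv.
sample = pd.DataFrame({
    "FID": [1, 2, 3, 4, 5],
    "SOIL_CODE": ["GlA", "GlC", "URB", "GlF", "UB"],
    "CLASS": ["1", "2", "1", "1", "3"],
})

STEEPNESS_MULTIPLIER = 3
CLASS_MULTIPLIER = 5
THRESHOLD = 22

codes = sample["SOIL_CODE"].astype(str)
# The third character is the slope letter; anything but A, B, C maps to NaN.
slope_points = codes.str[2].map({"A": 3, "B": 2, "C": 1})
# Exclude strip mine (S) and urban (U) codes and rows without a scored slope.
eligible = ~codes.str[0].isin(["S", "U"]) & slope_points.notna()

# Non-numeric class values become NaN and therefore never pass the threshold.
class_num = pd.to_numeric(sample["CLASS"], errors="coerce")
score = STEEPNESS_MULTIPLIER * slope_points + CLASS_MULTIPLIER * (5 - class_num)

top = sample.loc[eligible & (score > THRESHOLD), "FID"].tolist()
print(top)
```

In this made-up sample only the first row passes: it is level (A), class 1, and not an excluded code.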


## The Best Neighborhood
We combine the data from our Soil Code and Class calculations to give us a list of the areas in Pittsburgh with the highest overall scores. Cross-referencing this list of areas with the [map](https://openac-alcogis.opendata.arcgis.com/datasets/AlCoGIS::allegheny-county-soil-type-areas/explore), we can create a list of where the best soil locations in Allegheny County are. Neighborhoods with more of these well-soiled areas are placed higher on the list. Doing this, we come up with **Squirrel Hill** as the best neighborhood. It contains the largest area of top-rated soil in the Pittsburgh area.
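
If the result of that cross-referencing were recorded as a lookup from FID to neighborhood, the tally could be sketched as below. The lookup, the FIDs, and the placeholder neighborhood names are all hypothetical, since the cross-referencing here was done by hand against the map. The sketch also counts areas rather than measuring acreage, which is a simplification.

```python
from collections import Counter

# Hypothetical lookup from top-rated soil FIDs to neighborhoods.
fid_to_neighborhood = {
    101: "Neighborhood A",
    102: "Neighborhood B",
    103: "Neighborhood A",
    104: "Neighborhood C",
    105: "Neighborhood A",
}

# Count how many top-rated areas fall in each neighborhood, most first.
counts = Counter(fid_to_neighborhood.values())
ranking = counts.most_common()

for neighborhood, n_areas in ranking:
    print(neighborhood, n_areas)
```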

## Conclusion
We have found that Squirrel Hill is the best neighborhood. Personally, I agree with this result. Squirrel Hill contains a lot of open land and greenery. Schenley Park is largely free of urban development and would have a lot of potential if you were a plant. By this metric, Squirrel Hill is the best area for plant life.