**Task 2: Price Range Analysis**

Task list:

1. Determine the most common price range among all the restaurants.

2. Calculate the average rating for each price range.

3. Identify the color that represents the highest average rating among different price ranges.

In [125]:
import pandas as pd

In [126]:
new_data = pd.read_csv("new_data.csv")

**1. Getting the most common price range among all the restaurants**

In [127]:
common_price_range_check = new_data['Price range'].value_counts().reset_index(name='Count')
common_price_range_check

   Price range  Count
0            1   4444
1            2   3113
2            3   1408
3            4    586


The counts above show that price range 1 is the most common: it covers 4,444 of the 9,551 restaurants (about 47%).
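The same answer can be read off programmatically rather than by eye. The following is a minimal sketch that assumes a small made-up frame (`sample` is hypothetical and only stands in for `new_data`):

```python
import pandas as pd

# Hypothetical stand-in for new_data; only the 'Price range' column matters here.
sample = pd.DataFrame({"Price range": [1, 1, 1, 2, 2, 3, 4, 1, 2, 3]})

counts = sample["Price range"].value_counts()

# idxmax returns the index label (the price range) with the largest count.
most_common = counts.idxmax()

# Share of all rows that fall in that price range.
share = counts.max() / counts.sum()

print(f"Most common price range: {most_common} ({share:.0%} of restaurants)")
```

Replacing `sample` with `new_data` should give the same price range as the table above.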

**2. Calculating the average rating for each price range**

In [128]:
avg_rating_byPriceRange = round(new_data.groupby('Price range')['Aggregate rating'].mean(), 1).reset_index(name='Average rating')
avg_rating_byPriceRange

   Price range  Average rating
0            1             2.0
1            2             2.9
2            3             3.7
3            4             3.8


The result above shows that the average rating rises with every step up in price range. On average, the highest-priced restaurants receive the best ratings (3.8) and the lowest-priced restaurants the worst (2.0). The gain is smallest between price ranges 3 and 4 (3.7 versus 3.8).
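One way to check this trend without reading the table manually is sketched below. It assumes the four averages are typed in from the output above rather than recomputed from `new_data`:

```python
import pandas as pd

# Average ratings per price range, copied from the output above.
avg = pd.Series(
    [2.0, 2.9, 3.7, 3.8],
    index=pd.Index([1, 2, 3, 4], name="Price range"),
)

# True when every price range has an average at least as high as the one before it.
print(avg.is_monotonic_increasing)

# Gain in average rating from each price range to the next.
gains = avg.diff().dropna().round(1)
print(gains)
```

The `gains` values make it easy to see where the improvement levels off.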

**3. Identifying the color that represents the highest average rating among different price ranges**

In [129]:
checkForColorBasedOnRatingAndPriceRange = new_data[['Price range', 'Aggregate rating', 'Rating color']]
checkForColorBasedOnRatingAndPriceRange

      Price range  Aggregate rating  Rating color
0               3               4.8    Dark Green
1               3               4.5    Dark Green
2               4               4.4         Green
3               4               4.9    Dark Green
4               4               4.8    Dark Green
...           ...               ...           ...
9546            3               4.1         Green
9547            3               4.2         Green
9548            4               3.7        Yellow
9549            4               4.0         Green


In [130]:
# Test: identify the color that corresponds to the average rating of each price range
'''
The average rating for 'Price range 1' is 2.0
The average rating for 'Price range 2' is 2.94
The average rating for 'Price range 3' is 3.68
The average rating for 'Price range 4' is 3.82
'''

colorForAverageRatingForPriceRange1 = checkForColorBasedOnRatingAndPriceRange[checkForColorBasedOnRatingAndPriceRange['Aggregate rating']==2]
colorForAverageRatingForPriceRange1

      Price range  Aggregate rating  Rating color
1395            1               2.0           Red
5197            1               2.0           Red
7706            2               2.0           Red
8532            2               2.0           Red
9104            1               2.0           Red
9105            4               2.0           Red
9106            3               2.0           Red

The output above shows that a rating of 2.0, which is the average rating for price range 1, is colored 'Red' in every price range where it occurs. To identify the color for each price range's average rating, the code below builds a lookup from rating to color.

In [131]:
# Map each distinct rating to its color, using only the two relevant columns
rating_to_color = new_data[['Aggregate rating', 'Rating color']].drop_duplicates().set_index('Aggregate rating')['Rating color'].to_dict()
rating_to_color

{4.8: 'Dark Green',
 4.5: 'Dark Green',
 4.4: 'Green',
 4.9: 'Dark Green',
 4.0: 'Green',
 4.2: 'Green',
 4.3: 'Green',
 3.6: 'Yellow',
 4.7: 'Dark Green',
 3.0: 'Orange',
 3.8: 'Yellow',
 3.7: 'Yellow',
 3.2: 'Orange',
 3.1: 'Orange',
 0.0: 'White',
 4.1: 'Green',
 3.3: 'Orange',
 4.6: 'Dark Green',
 3.9: 'Yellow',
 3.4: 'Orange',
 3.5: 'Yellow',
 2.2: 'Red',
 2.9: 'Orange',
 2.4: 'Red',
 2.6: 'Orange',
 2.8: 'Orange',
 2.1: 'Red',
 2.7: 'Orange',
 2.5: 'Orange',
 1.8: 'Red',
 2.0: 'Red',
 2.3: 'Red',
 1.9: 'Red'}
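A dictionary like the one above is only a safe lookup if every rating value belongs to exactly one color. The sketch below shows one way to check that, assuming a small made-up frame (`sample` is hypothetical and stands in for `new_data`):

```python
import pandas as pd

# Hypothetical stand-in for new_data with the two columns the mapping uses.
sample = pd.DataFrame({
    "Aggregate rating": [4.8, 4.4, 3.7, 3.7, 2.0, 2.0, 0.0],
    "Rating color": ["Dark Green", "Green", "Yellow", "Yellow", "Red", "Red", "White"],
})

# Number of distinct colors seen for each rating value.
colors_per_rating = sample.groupby("Aggregate rating")["Rating color"].nunique()

# Ratings that appear with more than one color would make the lookup ambiguous.
ambiguous = colors_per_rating[colors_per_rating > 1]

print("Mapping is unambiguous:", ambiguous.empty)
```

If `ambiguous` is not empty on the real data, the dictionary silently keeps only one of the competing colors for each affected rating.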

In [132]:
# Look up the color for each price range's average rating
highest_ratings_colors = avg_rating_byPriceRange['Average rating'].map(rating_to_color)
highest_ratings_colors

# for setting the index name of the Series
# highest_ratings_colors.index.name = 'price_range'

0       Red
1    Orange
2    Yellow
3    Yellow
Name: Average rating, dtype: object

The output above gives the rating color for each 'Average rating', but only as a Series of colors, without the corresponding 'Price range' or 'Average rating' values.

To see each average rating next to its color, the mapped colors are added as a new column below:

In [133]:
# Improved mapping method

# Convert the dictionary to a Series for mapping
highest_ratings_colors = pd.Series(rating_to_color, name='color')

# Map colors to the DataFrame based on the Average rating
avg_rating_byPriceRange['color'] = avg_rating_byPriceRange['Average rating'].map(highest_ratings_colors)

# Display the final DataFrame with the color mapping
print("\nFinal DataFrame with Colors:")
print(avg_rating_byPriceRange)


Final DataFrame with Colors:
   Price range  Average rating   color
0            1             2.0     Red
1            2             2.9  Orange
2            3             3.7  Yellow
3            4             3.8  Yellow


In [134]:
for index, row in avg_rating_byPriceRange.iterrows():
    print(f"Average rating: {row['Average rating']}, Color: {row['color']}")

Average rating: 2.0, Color: Red
Average rating: 2.9, Color: Orange
Average rating: 3.7, Color: Yellow
Average rating: 3.8, Color: Yellow
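Exact-match mapping works here because each rounded average (2.0, 2.9, 3.7, 3.8) also occurs as an individual rating in the data; an average that never occurs as a rating would map to NaN. A possible alternative is to treat each color as a band of ratings. The sketch below assumes that the lowest rating listed for each color in the `rating_to_color` dictionary is the lower bound of its band. That assumption is read off this notebook's output and is not a documented rule of the dataset; `bands` and `averages` are hypothetical names.

```python
import pandas as pd

# Lowest rating listed for each color in the rating_to_color dictionary.
# Treating these as band lower bounds is an assumption, not a documented rule.
bands = pd.DataFrame({
    "lower_bound": [0.0, 1.8, 2.5, 3.5, 4.0, 4.5],
    "color": ["White", "Red", "Orange", "Yellow", "Green", "Dark Green"],
})

# Unrounded averages quoted in the In [130] cell; 2.0 is used for price range 1
# because the notebook does not show its unrounded value.
averages = pd.DataFrame({
    "Price range": [1, 2, 3, 4],
    "Average rating": [2.0, 2.94, 3.68, 3.82],
})

# merge_asof (default direction 'backward') picks, for each average, the last
# band whose lower bound is <= that average. Both frames must be sorted on
# their join keys.
result = pd.merge_asof(
    averages.sort_values("Average rating"),
    bands.sort_values("lower_bound"),
    left_on="Average rating",
    right_on="lower_bound",
)

print(result[["Price range", "Average rating", "color"]])
```

With these assumed bounds, any value between 0.0 and 1.8 would fall into the White band, which may not be what the data provider intends, so the bounds should be checked against the dataset's documentation before relying on them.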


In [135]:
'''
Note for educational purposes:

Code snippet 1: "highest_ratings_colors = avg_rating_byPriceRange['Average rating'].map(rating_to_color)"
Code snippet 2: "highest_ratings_colors = avg_rating_byPriceRange.map(rating_to_color)"

Both snippets produce the same colors, but each one applies to a different shape of 'avg_rating_byPriceRange'.

Code snippet 1 is used when 'avg_rating_byPriceRange' has had its index reset.  It is then a DataFrame with a default integer index, and 'Price range' is an ordinary column, as seen below:

   Price range  Average rating
0            1             2.0
1            2             2.9
2            3             3.7
3            4             3.8

Series.map accepts a dictionary, but a DataFrame as a whole cannot be mapped with one.  For this reason, the 'Average rating' column is selected explicitly before it is mapped with the 'rating_to_color' dictionary.

Code snippet 2 applies if 'avg_rating_byPriceRange' had been left indexed by 'Price range'.  The result below is then a Series, so it can be mapped directly with the 'rating_to_color' dictionary:

Price range
1    2.0
2    2.9
3    3.7
4    3.8
Name: Aggregate rating, dtype: float64

'''

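The two cases described in the note can be sketched side by side. The example assumes the four averages and a four-entry excerpt of `rating_to_color` are typed in by hand; `avg_series` and `avg_frame` are hypothetical names for the two shapes of `avg_rating_byPriceRange`:

```python
import pandas as pd

# A few entries from the rating_to_color dictionary shown earlier.
rating_to_color = {2.0: "Red", 2.9: "Orange", 3.7: "Yellow", 3.8: "Yellow"}

# Case 1: a Series indexed by 'Price range' (the shape groupby(...).mean() returns).
avg_series = pd.Series(
    [2.0, 2.9, 3.7, 3.8],
    index=pd.Index([1, 2, 3, 4], name="Price range"),
    name="Aggregate rating",
)
colors_from_series = avg_series.map(rating_to_color)  # Series.map accepts a dict
print(colors_from_series)

# Case 2: after reset_index the same data is a DataFrame, so the column
# has to be selected before calling Series.map.
avg_frame = avg_series.reset_index(name="Average rating")
colors_from_frame = avg_frame["Average rating"].map(rating_to_color)
print(colors_from_frame)
```

The colors are identical in both cases; only the index differs (price ranges 1 to 4 in the first, default positions 0 to 3 in the second).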

**Conclusion**

In [138]:
for index, row in avg_rating_byPriceRange.iterrows():
    print(f"For price range {row['Price range']}, the average rating is {row['Average rating']} corresponding to {row['color']}")


For price range 1, the average rating is 2.0 corresponding to Red
For price range 2, the average rating is 2.9 corresponding to Orange
For price range 3, the average rating is 3.7 corresponding to Yellow
For price range 4, the average rating is 3.8 corresponding to Yellow


This shows that price range 4 has the highest average rating (3.8) and that the color representing it is Yellow. Price range 3 (3.7) falls in the same Yellow band.
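The winning row can also be selected in code instead of being read from the printed lines. This is a minimal sketch that assumes the final table is typed in from the output above (`summary` is a hypothetical stand-in for `avg_rating_byPriceRange`):

```python
import pandas as pd

# Final table copied from the output above.
summary = pd.DataFrame({
    "Price range": [1, 2, 3, 4],
    "Average rating": [2.0, 2.9, 3.7, 3.8],
    "color": ["Red", "Orange", "Yellow", "Yellow"],
})

# idxmax gives the row label of the largest average; loc then fetches that row.
best = summary.loc[summary["Average rating"].idxmax()]

print(f"Price range {best['Price range']}: {best['Average rating']} ({best['color']})")
```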