In [None]:
import pandas as pd
import altair as alt

In [None]:
data = pd.read_csv('neu_rmp.csv')
data.head(20)

In [None]:
rating_plot = alt.Chart(data).mark_circle(size=100).encode(
    x=alt.X('Average Rating (Out of 5):Q', title='Average Rating'),
    y=alt.Y('Number of Ratings:Q', title='Number of Ratings'),
    # Green for professors rated 4.5 or higher, red for everyone else
    color=alt.condition(
        alt.datum['Average Rating (Out of 5)'] >= 4.5,
        alt.value('green'),
        alt.value('red')
    ),
    tooltip=['First Name', 'Last Name','Department', 'Average Rating (Out of 5)', 'Number of Ratings']
).properties(
    title='Number of Ratings vs. Average Rating',
    width=600,
    height=400
).interactive()

rating_plot

### Explanation
This code creates an interactive scatter plot that compares each professor's average rating with the number of ratings they have received. Professors with an average rating of 4.5 or higher are marked in green, and those below 4.5 are in red. The x-axis represents the average rating, and the y-axis shows the number of ratings. Hovering over a point shows the professor's name, department, average rating, and number of ratings. Based on the plot, it looks like higher-rated professors tend to have fewer ratings, while lower-rated ones have more feedback. Overall, it gives a quick visual of how each professor is rated and how much feedback they have received.
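
The visual impression that higher ratings come with fewer reviews could be checked numerically. Below is a minimal sketch, assuming the same column names as the dataset. The `sample` rows are made up for illustration and are not taken from `neu_rmp.csv`, and the `Color` column is a hypothetical helper that mirrors the chart's 4.5 threshold.

```python
import pandas as pd

# Hypothetical sample rows; the notebook itself loads neu_rmp.csv instead.
sample = pd.DataFrame({
    "Average Rating (Out of 5)": [4.8, 4.6, 3.9, 3.2, 2.5],
    "Number of Ratings": [12, 20, 85, 140, 60],
})

# Mirror the chart's color rule: green at 4.5 or higher, red otherwise.
is_high = sample["Average Rating (Out of 5)"] >= 4.5
sample["Color"] = is_high.map({True: "green", False: "red"})

# Median number of ratings within each color group.
median_by_color = sample.groupby("Color")["Number of Ratings"].median()

# Pearson correlation between rating and review count;
# a negative value would support "higher rated, fewer ratings".
corr = sample["Average Rating (Out of 5)"].corr(sample["Number of Ratings"])

print(median_by_color)
print(f"correlation: {corr:.2f}")
```

Running the same steps on `data` instead of `sample` would show whether the pattern holds for the full dataset.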

In [None]:
sorted_data = data.sort_values(by='Number of Ratings', ascending=False)
bar_chart = alt.Chart(sorted_data.head(10)).mark_bar().encode(
    x=alt.X('Number of Ratings:Q'),
    y=alt.Y('Last Name:N', sort='-x'),
    color=alt.Color('Average Rating (Out of 5):Q'),
    tooltip=['First Name', 'Last Name', 'Department', 'Number of Ratings', 'Average Rating (Out of 5)']  
).properties(
    title='Top 10 Professors with the Most Ratings',
    width=600,
    height=400
)

bar_chart

### Explanation
This code sorts the dataset by the number of ratings in descending order and selects the 10 professors with the most ratings. It then creates a horizontal bar chart where the x-axis shows the number of ratings and the y-axis displays each professor's last name, ordered from most to fewest ratings. The color of the bars is based on the average rating, with lighter blues for lower ratings and darker blues for higher ratings. When you hover over a bar, the tooltip shows more details: the professor's first and last name, department, number of ratings, and average rating.
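
As a side note, pandas also provides `DataFrame.nlargest`, which picks the top rows in a single call. Here is a minimal sketch comparing the two approaches on made-up rows. The names and counts in `sample` are hypothetical, and only two rows are selected to keep the example short.

```python
import pandas as pd

# Hypothetical sample rows; the notebook itself loads neu_rmp.csv instead.
sample = pd.DataFrame({
    "Last Name": ["Adams", "Baker", "Chen", "Diaz"],
    "Number of Ratings": [40, 120, 75, 10],
})

# Approach used in the notebook: sort descending, then take the first rows.
top_sorted = sample.sort_values(by="Number of Ratings", ascending=False).head(2)

# Alternative: ask pandas directly for the rows with the largest values.
top_nlargest = sample.nlargest(2, "Number of Ratings")

print(top_nlargest)
```

With no ties in the sample, both approaches select the same rows in the same order.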

In [None]:
scatter_plot = alt.Chart(data).mark_circle(size=100).encode(
    x=alt.X('Would Take Again (Percent):Q'),
    y=alt.Y('Level of Difficulty (Out of 5):Q'),
    color=alt.Color('Department:N'),  
    tooltip=['First Name', 'Last Name','Department', 'Would Take Again (Percent)', 'Level of Difficulty (Out of 5)'] 
).properties(
    title='Relationship Between Department, Would Take Again (%), and Level of Difficulty',
    width=600,
    height=400
).interactive()


scatter_plot

### Explanation
This code creates an interactive scatter plot that shows the relationship between the percentage of students who would take a professor again and that professor's level of difficulty, broken down by department. The x-axis represents the "Would Take Again" percentage, and the y-axis shows the "Level of Difficulty." Each point is one professor, colored by department, making it easy to distinguish professors from different areas. When you hover over a point, the tooltip displays the professor's first and last name, the department, the "Would Take Again" percentage, and the difficulty level. The chart is interactive, so you can zoom and pan to explore the data further.
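
With many departments, the colors in the scatter plot can be hard to tell apart, so a per-department summary table may help alongside it. Below is a minimal sketch, assuming the same column names as the dataset. The `sample` rows and department names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the notebook itself loads neu_rmp.csv instead.
sample = pd.DataFrame({
    "Department": ["Math", "Math", "History", "History"],
    "Would Take Again (Percent)": [90.0, 70.0, 50.0, 60.0],
    "Level of Difficulty (Out of 5)": [2.0, 3.0, 4.0, 3.5],
})

# Average of both measures within each department.
summary = sample.groupby("Department")[
    ["Would Take Again (Percent)", "Level of Difficulty (Out of 5)"]
].mean()

print(summary)
```

Replacing `sample` with `data` would give the same summary for the full dataset.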