reverse colorbar #65

Makhsuda · 2020-06-25T00:02:20Z

No description provided.

chendaniely · 2020-06-25T00:36:06Z

I can get the plot to work, but I think it's best to change the code so that we put in a placeholder for the date. The dashboard can then handle the animation instead of plotly doing it directly. See the code comment for the changes I made to make the iteration process faster.

chendaniely

If you change your code to this, it should at least plot faster...

chendaniely · 2020-06-25T00:37:07Z

analysis/db/us_map/choroplethMap.py

+
+# plt.show()
+# color_map = plt.cm.get_cmap('viridis')
+# reversed_viridis = color_map.reversed()


 fig = px.choropleth(molten_df,


I created a dataframe that is just the subset for one particular date, and then used that subsetted dataframe to plot the figure:

plot_data = molten_df[molten_df.date_iso == '2020-02-01']
fig = px.choropleth(plot_data,
                    geojson=counties,
                    locations=plot_data.fips_str,
                    color='value',
                    # animation_frame='date',
                    hover_data=['State', 'value'],
                    color_continuous_scale='viridis_r',
                    range_color=(0, 300),
                    scope="usa",
                    title='Confirmed cases',
                    labels={'value': 'confirmed cases'}
                    )
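One way to read the "placeholder for the date" idea is a small helper that takes the date as a parameter, so the dashboard can request one frame at a time. This is only a sketch on toy data: `subset_by_date` is a hypothetical name, not something in the PR, and the rows are invented.

```python
import pandas as pd


def subset_by_date(df: pd.DataFrame, date: str) -> pd.DataFrame:
    """Return only the rows whose date_iso equals the given date."""
    return df[df["date_iso"] == pd.Timestamp(date)]


# Toy stand-in for molten_df; the values are invented for illustration.
molten_df = pd.DataFrame({
    "fips_str": ["01001", "01001", "01003"],
    "date": ["1/31/20", "2/1/20", "2/1/20"],
    "value": [0, 1, 2],
})
molten_df["date_iso"] = pd.to_datetime(molten_df["date"], format="%m/%d/%y")

# The date is now an argument instead of a literal buried in the plotting code.
plot_data = subset_by_date(molten_df, "2020-02-01")
```

The returned `plot_data` can then be passed to `px.choropleth` the same way as in the snippet above.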

chendaniely · 2020-06-25T00:37:31Z

analysis/db/us_map/choroplethMap.py

@@ -34,7 +34,10 @@
 molten_df['date_iso'] = pd.to_datetime(molten_df['date'], format="%m/%d/%y")  # change date to ISO8601 standard format

 fips = molten_df['fips_str'].tolist()


Because of the changes below, you don't need this line anymore, since you're passing the column of values directly into the plotting function.

chendaniely · 2020-06-30T17:49:16Z

analysis/db/us_map/choroplethMap.py


 confirmed_df = pd.read_csv('https://github.com/CSSEGISandData/COVID-19/raw/master/csse_covid_19_data/'
                           'csse_covid_19_time_series/time_series_covid19_confirmed_US.csv')
 loc_df = pd.read_excel(here('./data/db/original/maps/State_FIPS.xlsx'))
+pop_df = pd.read_excel(here('./data/db/original/maps/PopulationEstimates.xls'))  # population dataset for 2019


where did this dataset come from?

chendaniely · 2020-06-30T17:49:49Z

analysis/db/us_map/choroplethMap.py

                           'csse_covid_19_time_series/time_series_covid19_confirmed_US.csv')
 loc_df = pd.read_excel(here('./data/db/original/maps/State_FIPS.xlsx'))
+pop_df = pd.read_excel(here('./data/db/original/maps/PopulationEstimates.xls'))  # population dataset for 2019


should provide a download link to where you got these datasets from.

chendaniely · 2020-06-30T17:52:01Z

analysis/db/us_map/choroplethMap.py

+
+molten_pop_df = pd.merge(molten_df, pop_df, on='fips_str')  # add population per county
+grouped_by = molten_pop_df.groupby(['fips_str', 'date_iso', 'Admin2', 'POP_ESTIMATE_2019'])['value'].sum().reset_index()
+grouped_by['value'] = grouped_by['value']/grouped_by['POP_ESTIMATE_2019']   # get per capita value


don't overwrite the original 'value' column. you should make a new column (in this case something like 'total_per_cap') that is assigned the per capita value
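A minimal sketch of that suggestion on toy data. The column name `total_per_cap` is the one floated above; the two rows and their numbers are made up.

```python
import pandas as pd

# Toy stand-in for grouped_by; the counts and populations are invented.
grouped_by = pd.DataFrame({
    "fips_str": ["01001", "01003"],
    "value": [50, 200],
    "POP_ESTIMATE_2019": [50000, 100000],
})

# Put the per capita value in a new column so the raw counts stay available.
grouped_by["total_per_cap"] = grouped_by["value"] / grouped_by["POP_ESTIMATE_2019"]
```

Both `value` and `total_per_cap` are then available for plotting, which also makes a raw vs. per capita comparison possible later.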

chendaniely · 2020-06-30T17:53:30Z

analysis/db/us_map/choroplethMap.py

-                    color_continuous_scale="Viridis",
-                    range_color=(0, 300),
+                    color_continuous_scale='viridis_r',
+                    range_color=(0, 500),


why did you choose 500? can we set this to something like max(per_cap) and use a variable instead of hard-coding a value?

Makhsuda · Jul 2, 2020

Yes, you are right, and I am working on that. I was thinking of using the third quartile (the 75th percentile) there, because the max value, which is for New York, is much higher than for the other states, and that skews the coloring. I tried to use a quartile function, but range_color didn't accept my input. The same goes for the per capita case, but there a different state shows the highest number of cases, which is very strange, so I assume I might be doing the calculations wrong.
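For the quartile idea, a sketch of how the upper bound could be computed with pandas. Why `range_color` rejected the earlier input is not shown in this thread; one possible cause (an assumption) is that it was given something other than a two-element sequence of plain numbers, so the helper below converts the quantile to a Python float. `color_range_upper` is a hypothetical name and the data is invented.

```python
import pandas as pd


def color_range_upper(values: pd.Series, q: float = 0.75) -> float:
    """Return the q-th quantile of values as a plain Python float."""
    return float(values.quantile(q))


# Toy data: the last value stands in for one extreme outlier county.
per_cap = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])

upper = color_range_upper(per_cap)
# A (min, max) pair of plain numbers, which is the shape range_color expects.
range_color = (0, upper)
```

With the bound at the third quartile, counties above it all get the top color, so the outlier no longer flattens the rest of the scale.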

chendaniely · 2020-06-30T17:56:02Z

analysis/db/us_map/graphs.py

+'''
+# ax = sns.lineplot(x="date_iso", y="value", hue='Province_State', data=grouped_counts)  # show cases per state monthly
+# ax = sns.stripplot(x="date_iso", y="value", hue='Province_State', data=grouped_counts)
+# ax = sns.violinplot(x='date_iso', y='value', hue='Province_State', data=grouped_counts, palette="Set2", split=True,
+#                     scale="count", inner="quartile")
+# ax = sns.countplot(x="date_iso", hue='Province_State', data=grouped_counts)  # works better if there are certain dates
+# plt.tight_layout()
+# plt.show()
+'''


why did you comment these out? we could add these general values to the dashboard too

chendaniely · 2020-06-30T17:58:17Z

analysis/db/us_map/choroplethMap.py

+                    # animation_frame='date',
+                    hover_data=['Admin2', 'value', 'POP_ESTIMATE_2019'],
+                    color_continuous_scale='viridis_r',
+                    range_color=(0, plot_data['value'].max()),


when you switch to the new column name, make sure you change it here as well.

chendaniely · 2020-06-30T18:00:23Z

analysis/db/us_map/choroplethMap.py

-
-
-
+'''


Files should end with a new line

Also, it might be worth having both the raw count and the per-capita count, with a toggle between the two maps.
Since the only real difference in the plotting code is which column you're plotting, we can make a function that takes a dataframe and the plotting column as input and returns the plot.

We would then be able to use that function to return both plots and feed them into the dashboard.
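A rough sketch of what such a function could look like. The names `choropleth_kwargs` and `make_choropleth` are hypothetical, the column `total_per_cap` assumes the rename suggested earlier, and the toy data is invented. The argument-building step is split out so it can be checked without drawing anything.

```python
import pandas as pd


def choropleth_kwargs(df: pd.DataFrame, color_col: str, geojson: dict, title: str) -> dict:
    """Build the px.choropleth arguments; only color_col differs between maps."""
    return dict(
        data_frame=df,
        geojson=geojson,
        locations="fips_str",
        color=color_col,
        color_continuous_scale="viridis_r",
        range_color=(0, float(df[color_col].max())),
        scope="usa",
        title=title,
    )


def make_choropleth(df: pd.DataFrame, color_col: str, geojson: dict, title: str):
    """Return the choropleth figure for the chosen column."""
    # Imported here so the helper above can be used without plotly installed.
    import plotly.express as px
    return px.choropleth(**choropleth_kwargs(df, color_col, geojson, title))


# Toy data for illustration only.
toy_df = pd.DataFrame({
    "fips_str": ["01001", "01003"],
    "value": [50, 200],
    "total_per_cap": [0.001, 0.002],
})
raw_args = choropleth_kwargs(toy_df, "value", {}, "Confirmed cases")
per_cap_args = choropleth_kwargs(toy_df, "total_per_cap", {}, "Confirmed cases per capita")
```

Calling `make_choropleth(plot_data, 'value', counties, ...)` and `make_choropleth(plot_data, 'total_per_cap', counties, ...)` would then give the two figures for the dashboard toggle.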

chendaniely · 2020-07-01T00:44:22Z

analysis/db/us_map/choroplethMap.py

 # TODO: See if rate is changing, counts over time (a 14 day sliding window count)
-# Choropleth map with time slider and hover text
+# TODO: Try to merge PopulationEstimates.xls to confirmed_df and remove State_FIPS.xlsx

 confirmed_df = pd.read_csv('https://github.com/CSSEGISandData/COVID-19/raw/master/csse_covid_19_data/'
                           'csse_covid_19_time_series/time_series_covid19_confirmed_US.csv')
 loc_df = pd.read_excel(here('./data/db/original/maps/State_FIPS.xlsx'))


link to where you got data from

reverse colorbar

f8c38bc

Makhsuda requested a review from chendaniely June 25, 2020 00:02

chendaniely requested changes Jun 25, 2020

Makhsuda added 3 commits June 26, 2020 13:08

add map with confirmed cases per capita

30f787c

add lineplot to see the rise of confirmed case numbers

30f3723

add noninteractive graphs

c065f06

chendaniely reviewed Jul 1, 2020

Makhsuda added 2 commits July 2, 2020 10:32

add datasets' original links

c086b33

change structure of the code

d1909ca
