Translate R plotting scripts into Python and add get_version.py #17

YeChen-IDM · 2022-06-09T21:18:37Z

5 scripts under review:

helpers_likelihood_and_metrics.py
helpers_plot_ref_sim_comparisons.py
helpers_coordinate_each_relationship.py
helpers_reformat_sim_ref_dfs.py
run_generate_validation_comparisons_site.py

Before we can run and test the main script, we still need to find Python equivalents for some R functions, such as ~lm() and arrange().
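As a rough starting point for that mapping, the sketch below shows two common substitutions: pandas `DataFrame.sort_values()` for dplyr's `arrange()`, and `numpy.polyfit()` with degree 1 for a simple `lm(y ~ x)` fit. The column names `site`, `ref_value`, and `sim_value` are made up for illustration and are not taken from the scripts under review.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({
    "site": ["b", "a", "c", "a"],
    "ref_value": [2.0, 1.0, 4.0, 3.0],
    "sim_value": [4.0, 2.0, 8.0, 6.0],
})

# R: arrange(df, site, ref_value)
arranged = df.sort_values(["site", "ref_value"]).reset_index(drop=True)

# R: coef(lm(sim_value ~ ref_value, data = df))
# polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(df["ref_value"], df["sim_value"], deg=1)
```

`polyfit` only returns the coefficients; if the R code also reads p-values or R-squared from the `lm()` summary, a different function would be needed.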

Here is one of the plots from the Python script (please note that ref_year is not grouped in the last plot):

Here is the original plot from the R scripts:


YeChen-IDM · 2022-06-15T22:27:06Z

create_plots/archive_organization/_20220603/helper_functions_age_inc_prev.py

@@ -0,0 +1,64 @@
+from plotnine import ggplot, aes, geom_line, geom_point, theme_bw, xlab, ylab, scale_color_manual, facet_wrap


This Python script was translated from an archived R script; no need to review.

YeChen-IDM · 2022-06-15T22:27:16Z

create_plots/archive_organization/_20220603/helper_functions_infection_duration.py

@@ -0,0 +1,144 @@
+#####################################################################################


This Python script was translated from an archived R script; no need to review.

YeChen-IDM · 2022-06-15T22:27:28Z

create_plots/archive_organization/_20220603/helper_functions_par_dens.py

@@ -88,11 +88,11 @@ def plot_par_dens_ref_sim_comparison(age_agg_sim_df, ref_df):
    # colors = brewer.pal(n=num_colors, name='BrBG')


This Python script was translated from an archived R script; no need to review.


YeChen-IDM · 2022-06-17T22:41:19Z

Error about conflicting dependencies when installing our package:

ERROR: Cannot install idmtools-cli and malaria-model-validation because these package versions have conflicting dependencies.
The conflict is caused by:
    datar 0.0.0 depends on pandas<2.0 and >=1.2
    plotnine 0.1.0 depends on pandas>=0.19.0
    idmtools 1.6.6 depends on pandas<1.2 and >=1.1.4

pandas 1.2 was released in December 2020; I'm not sure whether idmtools can drop its pandas<1.2 requirement.
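To make the conflict concrete, here is a minimal sketch that checks a few pandas releases against the two binding ranges from the error message above. It uses plain tuple comparison rather than pip's real resolver, so it ignores pre-releases and other PEP 440 details. The plotnine constraint (pandas>=0.19.0) is left out because every candidate already satisfies it.

```python
def parse(version):
    """Turn '1.1.4' into (1, 1, 4) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def allowed(version, lower, upper):
    """True if lower <= version < upper."""
    return parse(lower) <= parse(version) < parse(upper)


# Ranges copied from the pip error message.
requirements = {
    "datar": ("1.2", "2.0"),       # pandas>=1.2,<2.0
    "idmtools": ("1.1.4", "1.2"),  # pandas>=1.1.4,<1.2
}

# A few pandas releases to probe; not an exhaustive list.
candidates = ["1.1.4", "1.1.5", "1.2.0", "1.3.5", "1.4.2"]

compatible = [
    v for v in candidates
    if all(allowed(v, lo, hi) for lo, hi in requirements.values())
]
print(compatible)  # prints []
```

The datar range starts exactly where the idmtools range ends, so the two never overlap and pip cannot pick a pandas version until one of the constraints changes.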

YeChen-IDM · 2022-06-30T16:56:27Z

> Error about conflicting dependencies when installing our package:
> ERROR: Cannot install idmtools-cli and malaria-model-validation because these package versions have conflicting dependencies.
> The conflict is caused by:
>     datar 0.0.0 depends on pandas<2.0 and >=1.2
>     plotnine 0.1.0 depends on pandas>=0.19.0
>     idmtools 1.6.6 depends on pandas<1.2 and >=1.1.4
> pandas 1.2 was released in December 2020; I'm not sure whether idmtools can drop its pandas<1.2 requirement.

Clinton will remove this pandas<1.2 requirement in their upcoming release.


# Conflicts:
#	create_plots/helpers_coordinate_each_relationship.R
#	report/archive/Malaria_model_validation_output.pdf
#	report/archive/Malaria_model_validation_output_10-05-2022_13-12-33.pdf
#	requirements.txt
#	simulations/manifest.py

YeChen-IDM · 2022-08-31T17:32:57Z

create_plots/helpers_coordinate_each_relationship.py

+        loglik_df_asex_bench = get_dens_loglikelihood(combined_df=combined_df_asex, sim_column='benchmark')
+        loglik_df_asex_bench.rename(columns={"loglikelihood": "benchmark_loglike_asex"}, inplace=True)
+        # todo: need review, this dataframe is created but not used. Should we use it in line 234(loglik_df = pd.merge(...))
+        loglikelihood_comparison = pd.merge(loglik_df_asex, loglik_df_asex_bench, how="outer")


@MAmbrose-IDM, this dataframe is created but never used. Should we use it on line 234 (loglik_df = pd.merge(...))?
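For reference while deciding, the sketch below shows what the outer merge produces. The `site` key and the `sim_loglike_asex` column are hypothetical stand-ins; only `benchmark_loglike_asex` comes from the diff. It does not settle whether the result belongs in the merge on line 234.

```python
import pandas as pd

# Hypothetical stand-ins for the two log-likelihood tables.
loglik_df_asex = pd.DataFrame(
    {"site": ["A", "B"], "sim_loglike_asex": [-10.0, -12.5]}
)
loglik_df_asex_bench = pd.DataFrame(
    {"site": ["A", "C"], "benchmark_loglike_asex": [-11.0, -9.0]}
)

# With no `on=` argument, merge joins on the columns both frames share
# ("site" here). how="outer" keeps sites found in either frame and fills
# the missing side with NaN.
loglikelihood_comparison = pd.merge(
    loglik_df_asex, loglik_df_asex_bench, how="outer"
)
```

If the real frames share more columns than the intended key, passing `on=` explicitly avoids joining on a column by accident.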

YeChen-IDM · 2022-08-31T17:40:21Z

create_plots/helpers_reformat_sim_ref_dfs.py

+                                    coord_csv[coord_csv['site'] == cur_site]['infectiousness_to_mosquitos_ref'].iloc[0])
+        ref_df_cur = pd.read_csv(filepath_ref)
+        # todo: local variable 'upper_ages' is assigned to but never used
+        upper_ages = sorted(sim_df_cur['agebin'].unique())


Two errors need to be fixed:

local variable 'upper_ages' is assigned to but never used

undefined name 'sim_df_cur'


YeChen-IDM · 2022-09-02T21:40:40Z

Update Python plotting script according to R script fix in #19, commit 1713364, see details in #22

YeChen-IDM · 2022-09-02T21:43:04Z

create_plots/helpers_likelihood_and_metrics.py

+    mean_diff_df = mean_diff_df[mean_diff_df['change_abs_diff'].notnull()]
+    #todo: need code review in the following line:
+    # R code: mean_diff_df$abs_diff_changed = abs(mean_diff_df$change_abs_diff)/mean_diff_df$mean_abs_diff_bench > rel_change_threshold
+    mean_diff_df['abs_diff_changed'] = abs(mean_diff_df['change_abs_diff']) / mean_diff_df['mean_abs_diff_bench'] > rel_change_threshold


This line needs code review.
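One way to sanity-check the flagged line is to run it on a tiny frame. In Python, as in R, division binds tighter than `>`, so the expression groups as `(abs(...) / ...) > rel_change_threshold`. The data and the threshold value below are invented for the check.

```python
import pandas as pd

rel_change_threshold = 0.1  # hypothetical value, for illustration only

mean_diff_df = pd.DataFrame({
    "change_abs_diff": [-0.5, 0.05, 0.3],
    "mean_abs_diff_bench": [1.0, 1.0, 2.0],
})

# Relative change = |change| / benchmark; flag rows above the threshold.
# The parentheses only wrap the line; they do not change the grouping.
mean_diff_df["abs_diff_changed"] = (
    abs(mean_diff_df["change_abs_diff"]) / mean_diff_df["mean_abs_diff_bench"]
    > rel_change_threshold
)
print(mean_diff_df["abs_diff_changed"].tolist())  # prints [True, False, True]
```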

MAmbrose-IDM · 2022-09-08T19:50:59Z

create_plots/helpers_likelihood_and_metrics.py

+    mean_diff_df['change_type'] = 'better'
+    mean_diff_df[mean_diff_df['change_abs_diff'] < 0]['change_type'] = 'worse'
+    mean_diff_df[~mean_diff_df['abs_diff_changed']]['change_type'] = 'similar'
+


When I ran a quick test, it looked like ['change_type'] needs to come before [mean_diff_df['change_abs_diff'] < 0] or [~mean_diff_df['abs_diff_changed']] for this to work:

`mean_diff_df['change_type'][mean_diff_df['change_abs_diff'] < 0] = 'worse'`

`mean_diff_df['change_type'][~mean_diff_df['abs_diff_changed']] = 'similar'`
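An alternative worth considering is a single `.loc` call per assignment. The form in the diff, `df[mask]['col'] = value`, writes to a temporary filtered copy, which is why the original frame is left unchanged. The pandas documentation recommends `.loc[mask, 'col']` for conditional assignment because it selects rows and column in one step. The sketch below uses invented data.

```python
import pandas as pd

mean_diff_df = pd.DataFrame({
    "change_abs_diff": [0.4, -0.4, 0.01],
    "abs_diff_changed": [True, True, False],
})

mean_diff_df["change_type"] = "better"
# Row mask and column label go in the same .loc call.
mean_diff_df.loc[mean_diff_df["change_abs_diff"] < 0, "change_type"] = "worse"
mean_diff_df.loc[~mean_diff_df["abs_diff_changed"], "change_type"] = "similar"
print(mean_diff_df["change_type"].tolist())  # prints ['better', 'worse', 'similar']
```

The order of the two assignments matters: "similar" is applied last, so it overrides "worse" for rows whose change is below the threshold.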

Use Python's stats.linregress() function in a for loop in place of the nest() approach used in R to calculate linear regression / correlation information between the reference and simulation values. Note: the edited section was tested separately, but still needs to be tested in context.
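A minimal sketch of that loop pattern, assuming one regression per group: iterate over a pandas `groupby`, call `scipy.stats.linregress` on each group, and collect one summary row per group. The column names `site`, `reference`, and `simulation` are hypothetical and may not match the real dataframe.

```python
import pandas as pd
from scipy import stats

# Hypothetical combined reference/simulation table.
combined_df = pd.DataFrame({
    "site": ["A", "A", "A", "B", "B", "B"],
    "reference": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "simulation": [2.0, 4.0, 6.0, 3.0, 2.0, 1.0],
})

rows = []
for site, group in combined_df.groupby("site"):
    # One regression per group, like nest() + map(lm) in R.
    fit = stats.linregress(group["reference"], group["simulation"])
    rows.append({
        "site": site,
        "slope": fit.slope,
        "intercept": fit.intercept,
        "r_squared": fit.rvalue ** 2,
        "p_value": fit.pvalue,
    })

lm_summary = pd.DataFrame(rows)
```

Collecting plain dicts and building the frame once at the end avoids growing a dataframe inside the loop.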

YeChen-IDM · 2022-09-20T22:59:55Z

Results in Python (left) vs. results in R (right):

add helpers_likelihood_and_metrics.py(replace helpers_likelihood_and_metrics.R)

d91a5f9

YeChen-IDM changed the title ~~Translate R scripts into Python~~ Translate R plotting scripts into Python Jun 9, 2022

YeChen-IDM added 12 commits June 9, 2022 14:29

python index starts at 0 while R starts at 1!

20d4594

fix linter errors

43400b5

add helpers_plot_ref_sim_comparisons.py (replace helpers_plot_ref_sim_comparisons.R)

0ded83e

should ignore W503 not W504

4eb1f76

make linter happy

f28d911

fix undefuned variable pos_thresh_dens in get_duration_bins()

f2f8dea

move more Python scripts into archive folder

b251a7a

fix some syntax errors

020bb3f

more syntax fix

eeb11ff

move R scripts back until Python scripts can work

90e54db

remove the R achive

4d7fa1f

translate 2 more functions in helpers_likelihood_and_metrics.py

e1aebc7

YeChen-IDM commented Jun 15, 2022


YeChen-IDM added 5 commits June 15, 2022 15:39

some fix

203eef4

add helpers_coordinate_each_relationship.py(replace helpers_coordinate_each_relationship.R)

1e748f3

add helpers_reformat_sim_ref_dfs.py(replace helpers_reformat_sim_ref_dfs.R)

167a381

update manifest

8862045

add run_generate_validation_comparisons_site.py(replace run_generate_validation_comparisons_site.R)

9b6ce7e

YeChen-IDM added 2 commits June 17, 2022 15:41

more fixes

b5e7db7

fixes

c520d74

add get_version.py to get eradication and emodpy-malaria version. Update wait_* and report scripts.

4443ce8

YeChen-IDM changed the title ~~Translate R plotting scripts into Python~~ Translate R plotting scripts into Python and add get_version.py Jul 5, 2022

YeChen-IDM added 2 commits July 6, 2022 13:44

update report with current result and remove previous draft reports

e8f13bf

Update version_file_filepath in create_pdf_report_3.py

ee3e4eb

This was referenced Jul 26, 2022

Find equivalents in Python for multiple R functions and test #24

Closed

Write down Eradication version and Emodpy-malaria version in final report #31

Closed

Add plotting and reporting steps into snakemake pipeline(Pending) #32

Closed

YeChen-IDM commented Aug 31, 2022


YeChen-IDM added 2 commits August 31, 2022 10:41

update todo comment

c34a7df

Update Python plotting script according to R script fix in #19, commit 1713364, see details in #22 (cherry picked from commit cd7c875)

34dff14

YeChen-IDM commented Sep 2, 2022


YeChen-IDM mentioned this pull request Sep 8, 2022

Update Python plotting script according to R script fix in #19 #22

Closed

MAmbrose-IDM reviewed Sep 8, 2022


MAmbrose-IDM and others added 5 commits September 8, 2022 13:00

Update helpers_likelihood_and_metrics.py

b4f9905

fix plot_ref_sim_comparison()

2c7f0a5

Update Python plotting code, fix for #24

da735a0

add both result from R and Python

07485cf

YeChen-IDM mentioned this pull request Sep 20, 2022

Plotting #35

Closed

YeChen-IDM added 3 commits September 20, 2022 15:52

replace stats.multinomial

1806894

update result

fef09f4

make lint happy

6a56055

YeChen-IDM merged commit cbb7305 into InstituteforDiseaseModeling:main Sep 20, 2022

YeChen-IDM mentioned this pull request Oct 25, 2022

Integrate Python plots into the generate report/pdf script. #41

Closed
