Translate R plotting scripts into Python and add get_version.py #17
@@ -0,0 +1,64 @@
from plotnine import ggplot, aes, geom_line, geom_point, theme_bw, xlab, ylab, scale_color_manual, facet_wrap
This is a Python script translated from the archived R scripts; no need to review.
@@ -0,0 +1,144 @@
#####################################################################################
This is a Python script translated from the archived R scripts; no need to review.
@@ -88,11 +88,11 @@ def plot_par_dens_ref_sim_comparison(age_agg_sim_df, ref_df):
# colors = brewer.pal(n=num_colors, name='BrBG')
This is a Python script translated from the archived R scripts; no need to review.
…e_each_relationship.R)
…validation_comparisons_site.R)
Error about conflicting dependencies when installing our package:
Pandas 1.2 was released in December 2020; not sure whether idmtools can remove its requirement for pandas < 1.2.
Clinton will remove this pandas<1.2 requirement in their upcoming release.
…ate wait_* and report scripts.
# Conflicts:
#	create_plots/helpers_coordinate_each_relationship.R
#	report/archive/Malaria_model_validation_output.pdf
#	report/archive/Malaria_model_validation_output_10-05-2022_13-12-33.pdf
#	requirements.txt
#	simulations/manifest.py
loglik_df_asex_bench = get_dens_loglikelihood(combined_df=combined_df_asex, sim_column='benchmark')
loglik_df_asex_bench.rename(columns={"loglikelihood": "benchmark_loglike_asex"}, inplace=True)
# todo: need review, this dataframe is created but not used. Should we use it in line 234(loglik_df = pd.merge(...))
loglikelihood_comparison = pd.merge(loglik_df_asex, loglik_df_asex_bench, how="outer")
@MAmbrose-IDM, this dataframe is created but not used. Should we use it on line 234 (`loglik_df = pd.merge(...)`)?
coord_csv[coord_csv['site'] == cur_site]['infectiousness_to_mosquitos_ref'].iloc[0])
ref_df_cur = pd.read_csv(filepath_ref)
# todo: local variable 'upper_ages' is assigned to but never used
upper_ages = sorted(sim_df_cur['agebin'].unique())
Two errors need to be fixed:
- local variable 'upper_ages' is assigned to but never used
- undefined name 'sim_df_cur'
mean_diff_df = mean_diff_df[mean_diff_df['change_abs_diff'].notnull()]
#todo: need code review in the following line:
# R code: mean_diff_df$abs_diff_changed = abs(mean_diff_df$change_abs_diff)/mean_diff_df$mean_abs_diff_bench > rel_change_threshold
mean_diff_df['abs_diff_changed'] = abs(mean_diff_df['change_abs_diff']) / mean_diff_df['mean_abs_diff_bench'] > rel_change_threshold
Need code review here.
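For reference while reviewing this line, here is a minimal sketch of the translated expression on toy data. The values and the threshold of 0.1 are made up for illustration; only the column names come from the diff. In both R and Python, division binds tighter than `>`, so the ratio is computed first and then compared, and the explicit parentheses below are only for readability.

```python
import pandas as pd

# Hypothetical threshold and toy values, chosen only to illustrate the expression.
rel_change_threshold = 0.1

mean_diff_df = pd.DataFrame({
    "change_abs_diff": [0.5, -0.02, 0.3],
    "mean_abs_diff_bench": [1.0, 1.0, 10.0],
})

# Relative change = |change| / benchmark mean, then compared to the threshold.
# The parentheses make the evaluation order explicit; they do not change the result.
mean_diff_df["abs_diff_changed"] = (
    mean_diff_df["change_abs_diff"].abs() / mean_diff_df["mean_abs_diff_bench"]
) > rel_change_threshold

# The unparenthesized form from the diff gives the same boolean column.
unparenthesized = (
    abs(mean_diff_df["change_abs_diff"]) / mean_diff_df["mean_abs_diff_bench"]
    > rel_change_threshold
)
```

One difference to keep in mind: where the ratio is missing, R returns `NA` for the comparison while pandas returns `False`, so rows with a missing benchmark value would be treated as "not changed" in Python.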
mean_diff_df['change_type'] = 'better'
mean_diff_df[mean_diff_df['change_abs_diff'] < 0]['change_type'] = 'worse'
mean_diff_df[~mean_diff_df['abs_diff_changed']]['change_type'] = 'similar'
When I ran a quick test, it looked like `['change_type']` should come before `[mean_diff_df['change_abs_diff'] < 0]` or `[~mean_diff_df['abs_diff_changed']]` for this to work:
`mean_diff_df['change_type'][mean_diff_df['change_abs_diff'] < 0] = 'worse'`
`mean_diff_df['change_type'][~mean_diff_df['abs_diff_changed']] = 'similar'`
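A possible alternative, sketched on toy data: both forms above use chained indexing, which pandas may warn about (`SettingWithCopyWarning`) and which is not guaranteed to write back to the original dataframe. A single `.loc[row_mask, column]` assignment avoids that. The column names and labels are taken from the diff; the values are invented for illustration.

```python
import pandas as pd

# Toy data; only the column names come from the script under review.
mean_diff_df = pd.DataFrame({
    "change_abs_diff": [0.5, -0.4, 0.01],
    "abs_diff_changed": [True, True, False],
})

# Default label, then overwrite selected rows in place with .loc[mask, column].
mean_diff_df["change_type"] = "better"
mean_diff_df.loc[mean_diff_df["change_abs_diff"] < 0, "change_type"] = "worse"
mean_diff_df.loc[~mean_diff_df["abs_diff_changed"], "change_type"] = "similar"
```

The order of the two `.loc` assignments matters in the same way as in the original: rows that did not change enough are labeled 'similar' last, regardless of the sign of the change.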
Use the Python stats.linregress() function in a for loop in place of the nest() approach used in R to calculate linear regression / correlation information between the reference and simulation values. Note: the edited section was tested separately but still needs to be tested in context.
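A minimal sketch of what such a loop might look like, assuming `stats` refers to `scipy.stats` and that the data has one grouping column. The dataframe and the column names `site`, `reference`, and `simulation` are hypothetical stand-ins, not the names used in the actual script.

```python
import pandas as pd
from scipy import stats

# Hypothetical input: one row per (group, reference value, simulation value).
combined_df = pd.DataFrame({
    "site": ["A", "A", "A", "B", "B", "B"],
    "reference": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "simulation": [2.0, 4.0, 6.0, 3.0, 2.0, 1.0],
})

# Fit one regression per group, collecting the results row by row
# (this plays the role of nest() + lm() in the R version).
rows = []
for site, group in combined_df.groupby("site"):
    fit = stats.linregress(group["reference"], group["simulation"])
    rows.append({
        "site": site,
        "slope": fit.slope,
        "intercept": fit.intercept,
        "r_value": fit.rvalue,
        "p_value": fit.pvalue,
    })

lm_summary = pd.DataFrame(rows)
```

`lm_summary` ends up with one row per group, which can then be merged back onto other per-group tables as needed.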
5 scripts under review:
We still need to resolve some R functions, such as ~lm() and arrange(), before we can run and test the main script.
Here is one of the plots from the Python script (note that ref_year is not grouped in the last plot):
Here is the original plot from the R scripts: