Training data and retraining of the models #36

Open
teresa-m opened this issue Nov 23, 2021 · 2 comments

@teresa-m
Member

teresa-m commented Nov 23, 2021

We should have the definitive feature CSV training data stored somewhere for the following datasets:

  1. PARIS human, only RRIs used for occupied regions (paris_human_RRI)
  2. PARIS human, RRIs and RBP binding sites used for occupied regions (paris_human_RBPs)
  3. PARIS mouse, only RRIs used for occupied regions (paris_mouse_RRI)
  4. SPLASH human, only RRIs used for occupied regions (splash_human_RRI)
  5. PARIS + SPLASH human, only RRIs used for occupied regions (paris_splash_human_RRI)
  6. paris_human_RRI + paris_human_RBPs + paris_splash_human_RRI (full_human)
  7. paris_human_RRI + paris_human_RBPs + paris_splash_human_RRI + splash_human_RRI (full)

For this data we should have models trained on the following data combinations:

  1. paris_human
  2. paris_human_rbp
  3. paris_mouse
  4. splash_human
  5. full_human_rri
  6. full_human
  7. full
@teresa-m
Member Author

teresa-m commented Nov 23, 2021

TODO

  • check if we have new features
  • recalculate the feature files for all 4 datasets
  • send the feature files to Stefan
  • Stefan retrains the models
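
For the first TODO item, one way to spot new features is to compare the header rows of an old and a freshly computed feature file. This is only a sketch: it assumes the feature files are comma-separated with the feature names in the first row, and the column names in the demo (`feat_a`, `feat_b`, `feat_c`) are hypothetical.

```python
import csv
import io


def header_columns(handle):
    """Return the column names from the first row of a CSV file object."""
    return next(csv.reader(handle), [])


def new_features(old_handle, new_handle):
    """List columns present in the new feature file but not in the old one."""
    old = set(header_columns(old_handle))
    return [col for col in header_columns(new_handle) if col not in old]


# Hypothetical headers, for illustration only.
old_csv = io.StringIO("id,feat_a,feat_b\n1,0.1,0.2\n")
new_csv = io.StringIO("id,feat_a,feat_b,feat_c\n1,0.1,0.2,0.3\n")
print(new_features(old_csv, new_csv))
```

With real files, the two `io.StringIO` objects would be replaced by `open(path, newline="")` handles.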

@teresa-m
Member Author

teresa-m commented Nov 23, 2021

calls:

paris_human_RRI

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_noRBPs/paris_HEK293T_context_150_pos_occ_pos.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_human_RRI_pos.csv

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_noRBPs/paris_HEK293T_context_150_pos_occ_neg.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_human_RRI_neg.csv

paris_human_RBPs

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_c150_shortRBPs/paris_HEK_context_150_pos_occ_pos.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_human_RBPs_pos.csv

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_c150_shortRBPs/paris_HEK_context_150_pos_occ_neg.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_human_RBPs_neg.csv

paris_mouse_RRI

python get_features.py -i /vol/scratch/data/pos_neg_data_context/mouse_full_c150/paris_mouse_context_150_pos_occ_pos.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_mouse_RRI_pos.csv

python get_features.py -i /vol/scratch/data/pos_neg_data_context/mouse_full_c150/paris_mouse_context_150_pos_occ_neg.csv -f all -o /vol/scratch/data/features_files/all_feature_files/paris_mouse_RRI_neg.csv

splash_human_RRI

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_SPLASH_hES/SPLASH_hES_context_150_pos_occ_pos.csv -f all -o /vol/scratch/data/features_files/all_feature_files/splash_human_RRI_pos.csv

python get_features.py -i /vol/scratch/data/pos_neg_data_context/full_SPLASH_hES/SPLASH_hES_context_150_pos_occ_neg.csv -f all -o /vol/scratch/data/features_files/all_feature_files/splash_human_RRI_neg.csv
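
The eight get_features.py calls above follow one pattern (dataset name, input prefix, pos/neg), so they could be generated instead of typed by hand. A minimal sketch that only builds and prints the command lines, using the paths and flags from the calls above; the names `INPUT_PREFIX` and `feature_commands` are made up for this example.

```python
import shlex

IN_DIR = "/vol/scratch/data/pos_neg_data_context"
OUT_DIR = "/vol/scratch/data/features_files/all_feature_files"

# Dataset name -> input path prefix, copied from the calls above.
INPUT_PREFIX = {
    "paris_human_RRI": "full_noRBPs/paris_HEK293T_context_150_pos_occ",
    "paris_human_RBPs": "full_c150_shortRBPs/paris_HEK_context_150_pos_occ",
    "paris_mouse_RRI": "mouse_full_c150/paris_mouse_context_150_pos_occ",
    "splash_human_RRI": "full_SPLASH_hES/SPLASH_hES_context_150_pos_occ",
}


def feature_commands():
    """Build one get_features.py argument list per dataset and label."""
    cmds = []
    for name, prefix in INPUT_PREFIX.items():
        for label in ("pos", "neg"):
            cmds.append([
                "python", "get_features.py",
                "-i", f"{IN_DIR}/{prefix}_{label}.csv",
                "-f", "all",
                "-o", f"{OUT_DIR}/{name}_{label}.csv",
            ])
    return cmds


# Print the commands for review; nothing is executed here.
for cmd in feature_commands():
    print(shlex.join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` once the printed commands look right.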

paris_splash_human_RRI

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/paris_human_RRI_pos.csv /vol/scratch/data/features_files/all_feature_files/splash_human_RRI_pos.csv > /vol/scratch/data/features_files/all_feature_files/paris_splash_human_RRI_pos.csv

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/paris_human_RRI_neg.csv /vol/scratch/data/features_files/all_feature_files/splash_human_RRI_neg.csv > /vol/scratch/data/features_files/all_feature_files/paris_splash_human_RRI_neg.csv

full_human

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/paris_splash_human_RRI_pos.csv /vol/scratch/data/features_files/all_feature_files/paris_human_RBPs_pos.csv > /vol/scratch/data/features_files/all_feature_files/full_human_pos.csv

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/paris_splash_human_RRI_neg.csv /vol/scratch/data/features_files/all_feature_files/paris_human_RBPs_neg.csv > /vol/scratch/data/features_files/all_feature_files/full_human_neg.csv

full

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/full_human_pos.csv /vol/scratch/data/features_files/all_feature_files/paris_mouse_RRI_pos.csv > /vol/scratch/data/features_files/all_feature_files/full_pos.csv

awk 'FNR==1 && NR!=1{next;}{print}' /vol/scratch/data/features_files/all_feature_files/full_human_neg.csv /vol/scratch/data/features_files/all_feature_files/paris_mouse_RRI_neg.csv > /vol/scratch/data/features_files/all_feature_files/full_neg.csv
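
For reference, the awk program `FNR==1 && NR!=1{next;}{print}` prints every line of every input file except the first line of each file after the first, so the merged CSV keeps a single header. It does not check that the headers match, so all inputs need the same columns in the same order. A Python sketch of the same logic, in case the merge should move into a script; the function name `concat_keep_first_header` is made up for this example.

```python
import io


def concat_keep_first_header(inputs, output):
    """Copy all lines to output, skipping the first line of every input
    except when it is the very first line read overall (awk: FNR==1 && NR!=1)."""
    total = 0  # lines read across all inputs, like awk's NR
    for handle in inputs:
        for line_no, line in enumerate(handle, start=1):  # like awk's FNR
            total += 1
            if line_no == 1 and total != 1:
                continue
            # awk's print always ends the record with a newline
            output.write(line if line.endswith("\n") else line + "\n")


# Toy data, for illustration only.
merged = io.StringIO()
concat_keep_first_header(
    [io.StringIO("id,feat\n1,0.1\n"), io.StringIO("id,feat\n2,0.2\n")],
    merged,
)
print(merged.getvalue(), end="")
```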
