Commit

Edited sampler due to inconsistency between pandas for python 2.7/3.6
aewebb80 committed Aug 8, 2017
1 parent c9f18f4 commit ea617ee
Showing 3 changed files with 38 additions and 1 deletion.
15 changes: 15 additions & 0 deletions andrew/pipeline.log
@@ -0,0 +1,15 @@
2017-08-08 16:11:08,856 - logArgs - INFO: Arguments for vcf_sampler:
2017-08-08 16:11:08,856 - logArgs - INFO: Arguments uniform_bins: 10
2017-08-08 16:11:08,856 - logArgs - INFO: Arguments vcfname: example/input/merged_chr1_10000.vcf.gz
2017-08-08 16:11:08,856 - logArgs - INFO: Arguments random_seed: 1000
2017-08-08 16:11:08,856 - logArgs - INFO: Arguments statistic_file: example/merged_chr1_10000.windowed.weir.fst
2017-08-08 16:11:08,856 - logArgs - INFO: Arguments calc_statistic: windowed-weir-fst
2017-08-08 16:11:08,856 - logArgs - INFO: Arguments sample_size: 20
2017-08-08 16:11:08,856 - logArgs - INFO: Arguments sample_file: sampled_data.tsv
2017-08-08 16:11:08,856 - logArgs - INFO: Arguments sampling_scheme: random
2017-08-08 16:11:08,856 - logArgs - INFO: Arguments statistic_window_size: 10000
2017-08-08 16:11:08,856 - logArgs - INFO: Arguments out: out.vcf.gz
2017-08-08 16:11:08,859 - run - INFO: Sample (i.e. statistic) file assigned
2017-08-08 16:11:08,859 - run - INFO: Random sampling complete
2017-08-08 16:11:08,870 - run - INFO: Created selected samples file
2017-08-08 16:11:09,951 - run - INFO: Created selected VCF file
21 changes: 21 additions & 0 deletions andrew/sampled_data.tsv
@@ -0,0 +1,21 @@
CHROM BIN_START BIN_END N_VARIANTS WEIGHTED_FST MEAN_FST
2 chr1 1050001 1060000 16 0.711214 0.263166
3 chr1 1060001 1070000 5 0.245747 0.158881
5 chr1 1080001 1090000 7 0.580874 0.306382
6 chr1 1090001 1100000 6 0.624736 0.11536
10 chr1 1130001 1140000 7 0.668681 0.408978
11 chr1 1140001 1150000 4 0.625592 0.197199
12 chr1 1150001 1160000 5 0.435111 0.26577
14 chr1 1170001 1180000 17 -0.010333 0.0226575
15 chr1 1180001 1190000 10 0.496138 0.043603
16 chr1 1190001 1200000 8 0.534731 0.300696
17 chr1 1200001 1210000 5 0.0104593 0.0381257
18 chr1 1210001 1220000 1 -0.0464602 -0.0464602
19 chr1 1220001 1230000 6 0.755604 0.11035
20 chr1 1230001 1240000 4 0.127154 0.122056
21 chr1 1240001 1250000 12 0.528519 0.234784
22 chr1 1270001 1280000 1 -0.060459 -0.060459
26 chr1 1340001 1350000 3 0.0743918 0.0423312
29 chr1 1470001 1480000 30 0.454106 0.269902
31 chr1 1490001 1500000 4 0.218818 0.0282106
32 chr1 1500001 1510000 26 0.539772 0.189506
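
The header of `sampled_data.tsv` names six columns while each row carries seven fields: the leading number (2, 3, 5, ...) appears to be the pandas index that `to_csv` writes when the index has no name. A minimal sketch of reading the file back under that assumption follows; the first three rows are rebuilt here with explicit tabs, and the variable names are illustrative only.

```python
import io

import pandas as pd

# First rows of sampled_data.tsv, rebuilt with explicit tab separators.
# Assumption: the header begins with an empty field for the unnamed index.
rows = [
    ["", "CHROM", "BIN_START", "BIN_END", "N_VARIANTS", "WEIGHTED_FST", "MEAN_FST"],
    ["2", "chr1", "1050001", "1060000", "16", "0.711214", "0.263166"],
    ["3", "chr1", "1060001", "1070000", "5", "0.245747", "0.158881"],
    ["5", "chr1", "1080001", "1090000", "7", "0.580874", "0.306382"],
]
tsv_text = "\n".join("\t".join(row) for row in rows) + "\n"

# index_col=0 restores the original row labels of the sampled windows.
sampled = pd.read_csv(io.StringIO(tsv_text), sep="\t", index_col=0)
print(sampled)
```

With the real file, `io.StringIO(tsv_text)` would be replaced by the path passed as `sample_file`.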
3 changes: 2 additions & 1 deletion andrew/vcf_sampler.py
@@ -239,9 +239,10 @@ def run (passed_arguments = []):

  # Reduce to only selected samples
  sampled_samples = vcftools_samples[vcftools_samples.index.isin(selected_samples)]
+ print (sampled_samples)

  # Create selected samples TSV file, with either the default filename or a user-defined filename
- sampled_samples.to_csv(sampler_args.sample_file, sep = '\t')
+ sampled_samples.to_csv(sampler_args.sample_file, sep = '\t', float_format = '%g')

  logging.info('Created selected samples file')
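
The change to `to_csv` can be illustrated with a small sketch. It assumes the inconsistency named in the commit title concerns how floats are rendered as text when no format is given; the frame, its values, and the `to_tsv` helper are hypothetical stand-ins for `sampled_samples`, not code from the repository. Passing `float_format = '%g'` fixes the rendering to printf-style `%g`, which keeps at most six significant digits and drops trailing zeros.

```python
import io

import pandas as pd

# Hypothetical stand-in for sampled_samples; 0.1 + 0.2 is a float whose
# full-precision text form differs from its short form.
df = pd.DataFrame(
    {
        "CHROM": ["chr1", "chr1"],
        "N_VARIANTS": [17, 1],
        "MEAN_FST": [0.0226575, 0.1 + 0.2],
    },
    index=[14, 18],
)


def to_tsv(frame, **kwargs):
    """Write a frame as tab-separated text and return it as a string."""
    buf = io.StringIO()
    frame.to_csv(buf, sep="\t", **kwargs)
    return buf.getvalue()


default_text = to_tsv(df)                    # float rendering left to pandas
fixed_text = to_tsv(df, float_format="%g")   # explicit rendering, as in the commit

print(default_text)
print(fixed_text)
```

Because `%g` rounds to six significant digits, it is a trade-off: output becomes stable across environments, but values with more precision are shortened in the file. Integer and string columns are not affected.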
