docs/src/filtering_vcf.md
16 additions & 16 deletions
@@ -1,14 +1,14 @@
1
-
#Variant Record and Sample Selection
1
+
#Filtering your VCF file: Variant Record and Sample Selection
2
2
3
-
##General Note:s: Extracting and Reshaping VCF Data
3
+
##General Notes: Extracting and Reshaping VCF Data
4
4
5
5
VIVA supports flexible filters for selecting variant records for visualization.
6
6
7
7
Additionally, the tool supports selecting and grouping samples by common traits for visualization.
8
8
9
9
Grouping samples is particularly useful for exploring phenotypic and genotypic associations, displaying the differential distribution of variants between groups of samples, and identifying batch effects on coverage between groups of samples in variant analysis experiments.
10
10
11
-
##Choose a VCF file to Visualize *REQUIRED*
11
+
##Choose a VCF file to Visualize *REQUIRED*
12
12
13
13
Specify filename of VCF file.
14
14
@@ -19,16 +19,16 @@ Specify filename of VCF file.
19
19
*Note*: This is the *only required argument* for VIVA. If you run with none of the other options, default options will be used. These default options are described in detail below.
20
20
21
21
```
22
-
julia viva -f example.vcf [OPTIONS]
22
+
viva -f example.vcf [OPTIONS]
23
23
```
24
24
25
-
##Selecting Variant Records
25
+
##Selecting Variant Records
26
26
27
27
VIVA offers three filters for selecting variant records to visualize from VCF files.
28
28
29
29
We recommend using one or a combination of these filters to reduce the number of variant records extracted from the VCF for plotting, for both technical and practical reasons: the number of variant records that can be plotted is limited by the user's available computing resources and by the number of pixels available on their display. While it is possible to visualize many thousands of variant records at one time with VIVA, **we recommend visualizing fewer than 2000 variants** so that all data points can be displayed and your computing resources are not overburdened. VIVA is nonetheless capable of extracting and plotting hundreds of thousands of data points from VCF files.
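Before choosing a filter, it can help to check how many variant records a VCF holds. The following is a minimal sketch using standard Unix tools, not part of VIVA itself. It builds a tiny made-up VCF so that it runs on its own; the file contents and variable names are assumptions for illustration only.

```shell
# Build a tiny, made-up VCF so the sketch is self-contained.
# In practice, point "vcf" at your own file instead.
vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$vcf"
printf 'chr1\t20000\t.\tA\tG\t50\tPASS\t.\n' >> "$vcf"
printf 'chr1\t35000\t.\tC\tT\t12\tLowQual\t.\n' >> "$vcf"
printf 'chr2\t10000\t.\tG\tA\t99\tPASS\t.\n' >> "$vcf"

# Variant records are the lines that do not start with "#".
total=$(grep -vc '^#' "$vcf")

# FILTER is the 7th tab-separated column of a VCF record.
passed=$(awk -F'\t' '!/^#/ && $7 == "PASS" { n++ } END { print n + 0 }' "$vcf")

echo "total records: $total"
echo "PASS records:  $passed"

# 2000 is the soft limit recommended in this section.
if [ "$total" -gt 2000 ]; then
  echo "consider narrowing the selection with -r, -l, or -p before plotting"
fi

rm -f "$vcf"
```

If the count is well above 2000, one or more of the filters described in this section are likely worth applying.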
30
30
31
-
###Genomic range
31
+
###Genomic range
32
32
33
33
Select rows within a given genomic range.
34
34
@@ -39,10 +39,10 @@ Select rows within a given genomic range.
39
39
*Note*: To visualize genomic ranges within multiple chromosomes, you may create a batch script to run VIVA multiple times using different genomic ranges.
40
40
41
41
```
42
-
julia viva -f example.vcf -r chr1:20000-30000000
42
+
viva -f example.vcf -r chr1:20000-30000000
43
43
```
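The batch script mentioned in the note above could look roughly like the following sketch. It uses only the `-f` and `-r` flags shown in this section. The second range and the helper name `viva_cmd` are hypothetical, and the loop prints each command as a dry run rather than executing it.

```shell
#!/bin/sh
# Hypothetical input file and ranges; replace with your own.
vcf="example.vcf"
ranges="chr1:20000-30000000 chr2:100000-5000000"

# Build the viva command line for one genomic range.
viva_cmd() {
  printf 'viva -f %s -r %s\n' "$1" "$2"
}

for r in $ranges; do
  # Dry run: print the command.
  # To actually run VIVA, replace the next line with:  viva -f "$vcf" -r "$r"
  viva_cmd "$vcf" "$r"
done
```

Printing the commands first makes it easy to check the ranges before committing to a long run.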
44
44
45
-
###Variant list
45
+
###Variant list
46
46
47
47
Select variants matching list of chromosomal positions.
48
48
@@ -51,10 +51,10 @@ Select variants matching list of chromosomal positions.
51
51
*arguments*: Provide the filename of a two-column text file in .csv format as the argument. Row 1 should be a header with "chr" in column 1 and "start" in column 2. Column 1 should contain the chromosome in the format "chr1" or "1" and should match the syntax of the VCF file (that is, if the VCF file lists chromosomes in the form "chrX", use "chrX" in your positions list, not "X"). You can find an example of this file [here](https://github.com/compbiocore/VariantVisualization.jl/tree/master/test/test_files/positions_list.csv).
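As a rough illustration of the layout described above, the sketch below writes a small positions list and checks its header row. The positions, the file name, and the temporary directory are made up for the example; only the "chr" and "start" header names come from the description above.

```shell
# Write a hypothetical positions list into a temporary directory.
workdir=$(mktemp -d)
positions_file="$workdir/example_positions_list.csv"

# Chromosome names must match the VCF ("chr1" here, or "1" if the VCF uses that form).
cat > "$positions_file" <<'EOF'
chr,start
chr1,20000
chr1,35000
chrX,150000
EOF

# Sanity check: the header row should be exactly "chr,start".
header=$(head -n 1 "$positions_file")
if [ "$header" = "chr,start" ]; then
  echo "header ok"
else
  echo "unexpected header: $header" >&2
fi
```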
52
52
53
53
```
54
-
julia viva -f example.vcf -l "example_positions_list.txt"
Select rows that passed filters originally set during variant calling and VCF file generation. Selects records with "PASS" in the FILTER column of the VCF file. This filter alone is often not stringent enough to reduce the number of variants for plotting and visual interpretation. For analyzing large VCF files with many "passed" filter records, use genomic range,
60
60
@@ -63,12 +63,12 @@ Select rows that passed filters originally set during variant calling and VCF fi
63
63
*arguments*: This flag is a positional argument and does not take options.
64
64
65
65
```
66
-
julia viva -f example.vcf -p
66
+
viva -f example.vcf -p
67
67
```
68
68
69
-
##Selecting and Grouping Samples
69
+
##Selecting and Grouping Samples
70
70
71
-
###Group samples by sample metadata traits
71
+
###Group samples by sample metadata traits
72
72
73
73
Group sample columns using your sample metadata and visualize metadata attributes in a colorbar above heatmap visualizations.
74
74
@@ -94,10 +94,10 @@ Metadata traits are stored as rownames in the first column of the table and shou
94
94
This matrix should be saved as a comma-delimited .csv file. Microsoft Excel is commonly used for this purpose, but it sometimes adds extra delimiter characters to the output file that cause an error in VIVA. To check that the .csv file was saved properly, open it in a text editor such as BBEdit and delete any empty values or extra delimiter characters at the end of each row.
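The manual check above can also be done from the command line. The following is a minimal sketch using `grep` and `sed`, not a VIVA feature. The function names are hypothetical, the demo file contents are placeholders that do not represent the required matrix layout, and the sketch assumes Unix line endings.

```shell
# Print the rows (with line numbers) that end in a comma.
find_trailing_commas() {
  grep -n ',$' "$1"
}

# Print the file with any run of trailing commas removed from each row.
strip_trailing_commas() {
  sed 's/,*$//' "$1"
}

# Made-up demo file with Excel-style trailing delimiters.
demo=$(mktemp)
printf 'a,b,c,,\nd,e,f,\n' > "$demo"

find_trailing_commas "$demo"
cleaned=$(strip_trailing_commas "$demo")
printf '%s\n' "$cleaned"

rm -f "$demo"
```

Note that this also removes an intentionally empty final value, so it is worth reviewing the flagged rows before overwriting the original file.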
95
95
96
96
```
97
-
julia viva -f example.vcf -g sample_metadata_matrix.csv case,control
Select specific samples to be extracted from the VCF for visualization.
103
103
@@ -108,5 +108,5 @@ Select specific samples to be extracted from the VCF for visualization.
108
108
*Note*: To use the sample selection feature in combination with the sample grouping feature, the sample metadata matrix must contain only the sample IDs to be selected.
109
109
110
110
```
111
-
julia viva -f example.vcf --select_samples select_samples_list.txt