# <a id='toc1_'></a>[ClinVar Variant Analysis](#toc0_)

The clinvar_variation_analysis notebook contains an analysis of ClinVar variant data.

**Table of contents**<a id='toc0_'></a>    
- [ClinVar Variant Analysis](#toc1_)    
  - [Initialize](#toc1_1_)    
    - [Import necessary libraries](#toc1_1_1_)    
    - [Create output directory](#toc1_1_2_)    
    - [Import variant information file](#toc1_1_3_)    
  - [Add Supported Status of Variant based on in.issue](#toc1_2_)    
  - [Add Normalization Status of Variant based on out.errors](#toc1_3_)    
    - [Set Normalize Status of Variant as T/F](#toc1_3_1_)    
      - [Summary Table](#toc1_3_1_1_)    
  - [Create subgroups based on Variant Status](#toc1_4_)    
    - [Supported and Normalized Variants](#toc1_4_1_)    
    - [Supported and Not Normalized Variants](#toc1_4_2_)    
    - [Not Supported Variants](#toc1_4_3_)    
  - [Counting variants from each group](#toc1_5_)    
  - [Counting variant types for each group](#toc1_6_)    

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

## <a id='toc1_1_'></a>[Initialize](#toc0_)

### <a id='toc1_1_1_'></a>[Import necessary libraries](#toc0_)

In [1]:
import ndjson
import pandas as pd
import numpy as np
from pathlib import Path
import gzip

### <a id='toc1_1_2_'></a>[Create output directory](#toc0_)

In [2]:
path = Path("variation_analysis_output")
path.mkdir(exist_ok=True)

### <a id='toc1_1_3_'></a>[Import variant information file](#toc0_)

In [3]:
with gzip.open("normalized-vi.json.gz", "rb") as f:
    file_content = ndjson.load(f)

In [4]:
df = pd.json_normalize(file_content)
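
The loading step above relies on the third-party `ndjson` package. As a minimal sketch, assuming the file holds one JSON object per line, the same structure can be read with only the standard library and then flattened with `pd.json_normalize`. The file name `toy-vi.json.gz` and the two records are invented for illustration and are not taken from the real input file.

```python
import gzip
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical records shaped like the nested "in"/"out" objects used here.
records = [
    {"in": {"variation_id": "1", "issue": None}, "out": {"type": "Allele"}},
    {"in": {"variation_id": "2", "issue": "unsupported hgvs expression."}, "out": None},
]

with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy-vi.json.gz"

    # Write one JSON object per line (NDJSON), gzip-compressed.
    with gzip.open(toy_path, "wt", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

    # Read it back line by line, skipping blank lines.
    with gzip.open(toy_path, "rt", encoding="utf-8") as f:
        loaded = [json.loads(line) for line in f if line.strip()]

# Nested keys become dotted column names such as "in.issue" and "out.type".
toy_df = pd.json_normalize(loaded)
print(toy_df.columns.tolist())
```

Reading line by line also makes it possible to filter or sample records before building the DataFrame, which can help when the full file is large.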

## <a id='toc1_2_'></a>[Add Supported Status of Variant based on in.issue](#toc0_)

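Inspect the distinct `in.issue` values. Note that `value_counts()` excludes missing values by default, so variants with a blank `in.issue` do not appear in this output.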

In [5]:
df["in.issue"].value_counts()

in.issue
No viable variation members identified.                       2093
haplotype and genotype variations are not supported.          1540
sequence for accession not supported by vrs-python release     668
repeat expressions are not supported.                          474
unsupported hgvs expression.                                   457
intronic positions are not resolvable in sequence.             272
range copies are not supported.                                 95
expression contains unbalaned paretheses.                        3
Name: count, dtype: int64

In [6]:
df["in.issue"] = df["in.issue"].fillna("None")

In [7]:
# Any variant whose in.issue matches one of these messages is not supported.
# The strings are copied verbatim from the data, including the misspelling
# in the last one.
unsupported_issues = [
    "No viable variation members identified.",
    "haplotype and genotype variations are not supported.",
    "sequence for accession not supported by vrs-python release",
    "repeat expressions are not supported.",
    "unsupported hgvs expression.",
    "intronic positions are not resolvable in sequence.",
    "range copies are not supported.",
    "expression contains unbalaned paretheses.",
]
df["support_status"] = ~df["in.issue"].isin(unsupported_issues)

In [8]:
df["support_status"].value_counts()

support_status
True     3738092
False       5602
Name: count, dtype: int64

## <a id='toc1_3_'></a>[Add Normalization Status of Variant based on out.errors](#toc0_)

The errors are stored as a list of values, some of which are strings and others dictionaries, depending on whether the error was handled by the Variation Normalizer itself or after the normalizer.

The `get_errors` function extracts the text of the error responses for better readability and easier string processing.

In [9]:
import ast

def get_errors(errors) -> str:
    """Represent the errors for a variant as a single string.

    :param errors: list of errors (strings and/or dicts), a stringified
        list/dict, or a missing value
    :return: "; "-joined error messages, or "Success" if no errors are recorded
    """
    # pd.isna() returns an array for list input, whose truth value is
    # ambiguous, so only test scalar values for missingness
    if not isinstance(errors, (list, dict, str)) and pd.isna(errors):
        return "Success"

    # Parse if it's a stringified list/dict
    if isinstance(errors, str):
        try:
            errors = ast.literal_eval(errors)
        except Exception:
            return errors  # return raw string if not a valid list or dict

    errors_out = []

    # Normalize to list
    if not isinstance(errors, list):
        errors = [errors]

    for e in errors:
        if isinstance(e, str):
            errors_out.append(e)
        elif isinstance(e, dict):
            for k, v in e.items():
                # Only these keys carry error text
                if k not in ["msg", "response-errors"]:
                    continue
                if isinstance(v, str):
                    errors_out.append(v)
                elif isinstance(v, list):
                    # multiple error messages; str() guards against non-string items
                    errors_out.extend(str(item) for item in v)
    return "; ".join(errors_out)

In [10]:
df["error_string"] = df["out.errors"].apply(get_errors)
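
To make the input shapes concrete, here is a small self-contained sketch. The helper `flatten_errors` is a hypothetical, condensed stand-in for `get_errors` (it omits the stringified-input branch and reads the two keys in a fixed order), and the sample error messages are invented rather than taken from the data.

```python
import pandas as pd


def flatten_errors(errors) -> str:
    """Condensed, hypothetical stand-in for get_errors (illustration only)."""
    # Lists and dicts are never "missing"; only a scalar NaN/None means no error.
    if not isinstance(errors, (list, dict)) and pd.isna(errors):
        return "Success"
    if not isinstance(errors, list):
        errors = [errors]
    messages = []
    for e in errors:
        if isinstance(e, str):
            messages.append(e)
        elif isinstance(e, dict):
            # Only these two keys carry error text; other keys are ignored.
            for key in ("msg", "response-errors"):
                value = e.get(key)
                if isinstance(value, str):
                    messages.append(value)
                elif isinstance(value, list):
                    messages.extend(str(v) for v in value)
    return "; ".join(messages)


samples = pd.Series(
    [
        float("nan"),  # no errors recorded
        ["Unable to tokenize: X"],  # plain string error
        [{"msg": "timeout", "status": 500}],  # dict with a "msg" key
        [{"response-errors": ["bad input", "no hit"]}],  # dict with a list of messages
    ]
)
result = samples.apply(flatten_errors).tolist()
print(result)
```

Each input collapses to one string, so the resulting column can be summarized with `value_counts()` or filtered with ordinary string comparisons.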

In [11]:
df["error_string"].value_counts()

error_string
Success                                                                                  3736937
Unable to get GRCh37/GRCh38 assembly for: NC_000023.9                                        422
Unable to get GRCh37/GRCh38 assembly for: NC_000024.8                                        392
Unable to get GRCh37/GRCh38 assembly for: NC_000002.10                                       305
Unable to get GRCh37/GRCh38 assembly for: NC_000009.10                                       279
                                                                                          ...   
Unable to find classification for: NC_000014.8:g.(?_88399357)_(88417093_88429727)del           1
Unable to find classification for: NC_000013.11:g.(32380146_32394688)_(32398771_?)del          1
Unable to find classification for: NC_000006.11:g.(?_32006191)_(32007026_32007132)del          1
Unable to find classification for: NC_000001.10:g.(?_1447541)_(1454371_1455520)del             1
Unable to tokeniz

Some Not Supported variants have no recorded error, so `get_errors` inaccurately marks them as "Success". They carry no error because their "Not Supported" label was assigned manually.

An error string ("Not Supported") is therefore entered manually for those variants so that they are not categorized as normalized.

In [12]:
df.loc[
    (~df["support_status"]) & (df["error_string"] == "Success"),
    "error_string",
] = "Not Supported"

### <a id='toc1_3_1_'></a>[Set Normalize Status of Variant as T/F](#toc0_)

If an error is present, the variant was not normalized, so its normalize status is False.

In [13]:
df["normalize_status"] = df["error_string"] == "Success"
df

Unnamed: 0,out,in.variation_id,in.name,in.vrs_class,in.range_copies,in.issue,in.variation_type,in.subclass_type,in.mappings,in.cytogenetic,...,out.state.type,out.state.length,out.state.sequence,out.state.repeatSubunitLength,out.extensions,in.absolute_copies,out.copies,support_status,error_string,normalize_status
0,,427832,NM_000785.3(CYP27B1):c.[1319_1325dupCCCACCC];[...,Not Available,[],haplotype and genotype variations are not supp...,CompoundHeterozygote,Genotype,"[{'system': 'ClinVar', 'code': '427832', 'rela...",,...,,,,,,,,False,Not Supported,False
1,,424704,NM_003977.3(AIP):c.[-125-145_-125-144delCGinsA...,Not Available,[],haplotype and genotype variations are not supp...,CompoundHeterozygote,Genotype,"[{'system': 'ClinVar', 'code': '424704', 'rela...",,...,,,,,,,,False,Not Supported,False
2,,982544,NM_000329.3(RPE65):c.[1067dup];[1543C>T],Not Available,[],haplotype and genotype variations are not supp...,CompoundHeterozygote,Genotype,"[{'system': 'ClinVar', 'code': '982544', 'rela...",,...,,,,,,,,False,Not Supported,False
3,,424730,NM_000372.4(TYR):c.[1276_1282delATGGTTC];[139G>A],Not Available,[],haplotype and genotype variations are not supp...,CompoundHeterozygote,Genotype,"[{'system': 'ClinVar', 'code': '424730', 'rela...",,...,,,,,,,,False,Not Supported,False
4,,424736,NM_005045.3(RELN):c.[2213G>A];[9427T>G],Not Available,[],haplotype and genotype variations are not supp...,CompoundHeterozygote,Genotype,"[{'system': 'ClinVar', 'code': '424736', 'rela...",,...,,,,,,,,False,Not Supported,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3743689,,3911672,NC_012920.1(MT-CO3):m.9698T>C,Allele,[],,single nucleotide variant,SimpleAllele,[],,...,LiteralSequenceExpression,,C,,,,,True,Success,True
3743690,,693215,NC_012920.1(MT-CO3):m.9751T>C,Allele,[],,single nucleotide variant,SimpleAllele,[],,...,LiteralSequenceExpression,,C,,,,,True,Success,True
3743691,,693235,NC_012920.1(MT-CO3):m.9828G>A,Allele,[],,single nucleotide variant,SimpleAllele,[],,...,LiteralSequenceExpression,,A,,,,,True,Success,True
3743692,,3049652,NM_001372044.2(SHANK3):c.1111-8C>T,Allele,[],,single nucleotide variant,SimpleAllele,[],,...,,,,,,,,True,Unable to tokenize: NG_070230.1:g.18747C>T,False


#### <a id='toc1_3_1_1_'></a>[Summary Table](#toc0_)

In the table below, each cell shows the number of variants for a given expected behavior (rows: `support_status` and `in.issue`) and actual outcome (columns: `out.type`).

If a variant was in an "expected to pass" category (`support_status` is True) and ends up in the `NONE` column, meaning it has no `out.type`, that is an instance of a normalizer failure on a supported variant.

In [14]:
summary_df = (
    df[["in.variation_id", "support_status", "in.issue", "out.type"]]
    .fillna("NONE")
    .groupby(["support_status", "in.issue", "out.type"])
    .count()
    .unstack(level=2)
    .fillna(0)
    .astype(int)["in.variation_id"]
)

In [15]:
summary_df["VariantSum"] = summary_df.sum(axis=1)

In [16]:
summary_df["NormalizedSum"] = summary_df[
    ["Allele", "CopyNumberChange", "CopyNumberCount"]
].sum(axis=1)

In [17]:
summary_df["NormalizedPercent"] = (
    summary_df["NormalizedSum"] / summary_df["VariantSum"]
).apply(lambda x: f"{round(x * 100, 2)}%")

In [18]:
summary_df = summary_df.drop(["VariantSum", "NormalizedSum"], axis=1)
summary_df

Unnamed: 0_level_0,out.type,Allele,CopyNumberChange,CopyNumberCount,NONE,NormalizedPercent
support_status,in.issue,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
False,No viable variation members identified.,0,0,0,2093,0.0%
False,expression contains unbalaned paretheses.,0,0,0,3,0.0%
False,haplotype and genotype variations are not supported.,0,0,0,1540,0.0%
False,intronic positions are not resolvable in sequence.,0,0,0,272,0.0%
False,range copies are not supported.,0,0,0,95,0.0%
False,repeat expressions are not supported.,0,0,0,474,0.0%
False,sequence for accession not supported by vrs-python release,0,0,0,668,0.0%
False,unsupported hgvs expression.,0,0,0,457,0.0%
True,,3659725,32230,39379,6758,99.82%


In [19]:
summary_df.to_csv("variation_analysis_output/variant_analysis_summary_df.csv")
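
The same kind of pivot can be sketched with `pd.crosstab`, which counts rows per (index, column) pair directly. This is an alternative shown on invented toy data, not the method used in this notebook. Treating every `out.type` other than `NONE` as normalized is an assumption; here it matches the three types summed above.

```python
import pandas as pd

# Invented toy rows; the real table has millions of variants.
toy = pd.DataFrame(
    {
        "support_status": [True, True, True, True, False],
        "in.issue": [None, None, None, None, "unsupported hgvs expression."],
        "out.type": ["Allele", "Allele", "CopyNumberCount", None, None],
    }
).fillna("NONE")

# Rows: (support_status, in.issue); columns: out.type; cells: row counts.
table = pd.crosstab([toy["support_status"], toy["in.issue"]], toy["out.type"])

# Share of each row that received any out.type (assumption: NONE means failure).
normalized = table.drop(columns="NONE").sum(axis=1)
table["NormalizedPercent"] = (normalized / table.sum(axis=1) * 100).round(2)
print(table)
```

Keeping `NormalizedPercent` numeric, rather than formatting it as a string with a percent sign, leaves the column sortable; formatting can be applied at display time.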

## <a id='toc1_4_'></a>[Create subgroups based on Variant Status](#toc0_)

### <a id='toc1_4_1_'></a>[Supported and Normalized Variants](#toc0_)

In [20]:
supported_df = df.copy()

In [21]:
supported_df = supported_df.loc[
    (supported_df["support_status"] & supported_df["normalize_status"])
]
supported_df

Unnamed: 0,out,in.variation_id,in.name,in.vrs_class,in.range_copies,in.issue,in.variation_type,in.subclass_type,in.mappings,in.cytogenetic,...,out.state.type,out.state.length,out.state.sequence,out.state.repeatSubunitLength,out.extensions,in.absolute_copies,out.copies,support_status,error_string,normalize_status
25,,3247925,NC_000001.10:g.(?_103385847)_(103496820_?)del,CopyNumberChange,[],,Deletion,SimpleAllele,[],1p21.1,...,,,,,,,,True,Success,True
26,,832762,NC_000001.11:g.(?_119415410)_(119422630_?)del,CopyNumberChange,[],,Deletion,SimpleAllele,[],1p12,...,,,,,,,,True,Success,True
27,,3247694,NC_000001.10:g.(?_154141771)_(154148734_?)del,CopyNumberChange,[],,Deletion,SimpleAllele,[],1q21.3,...,,,,,,,,True,Success,True
28,,1457138,NC_000001.10:g.(?_156084710)_(156085085_?)del,CopyNumberChange,[],,Deletion,SimpleAllele,[],1q22,...,,,,,,,,True,Success,True
29,,3247637,NC_000001.10:g.(?_156830727)_(156836790_?)del,CopyNumberChange,[],,Deletion,SimpleAllele,[],1q23.1,...,,,,,,,,True,Success,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3743688,,693181,NC_012920.1(MT-CO3):m.9549C>T,Allele,[],,single nucleotide variant,SimpleAllele,"[{'system': 'dbSNP', 'code': '693181', 'relati...",,...,LiteralSequenceExpression,,T,,,,,True,Success,True
3743689,,3911672,NC_012920.1(MT-CO3):m.9698T>C,Allele,[],,single nucleotide variant,SimpleAllele,[],,...,LiteralSequenceExpression,,C,,,,,True,Success,True
3743690,,693215,NC_012920.1(MT-CO3):m.9751T>C,Allele,[],,single nucleotide variant,SimpleAllele,[],,...,LiteralSequenceExpression,,C,,,,,True,Success,True
3743691,,693235,NC_012920.1(MT-CO3):m.9828G>A,Allele,[],,single nucleotide variant,SimpleAllele,[],,...,LiteralSequenceExpression,,A,,,,,True,Success,True


In [22]:
variation_type_count_supported_df = supported_df.value_counts(
    ["in.variation_type", "in.issue"]
).reset_index()
variation_type_count_supported_df

Unnamed: 0,in.variation_type,in.issue,count
0,single nucleotide variant,,3388754
1,Deletion,,159681
2,Duplication,,73241
3,Microsatellite,,36028
4,copy number gain,,21484
5,copy number loss,,20367
6,Indel,,17018
7,Insertion,,12982
8,Inversion,,1401
9,Variation,,379


In [23]:
variation_type_count_supported_df.to_csv(
    "variation_analysis_output/variation_type_count_supported_df.csv"
)

### <a id='toc1_4_2_'></a>[Supported and Not Normalized Variants](#toc0_)

In [24]:
supported_not_normalized_df = df.copy()

In [25]:
supported_not_normalized_df = supported_not_normalized_df.loc[
    (
        supported_not_normalized_df["support_status"]
        & ~supported_not_normalized_df["normalize_status"]
    )
]
supported_not_normalized_df

Unnamed: 0,out,in.variation_id,in.name,in.vrs_class,in.range_copies,in.issue,in.variation_type,in.subclass_type,in.mappings,in.cytogenetic,...,out.state.type,out.state.length,out.state.sequence,out.state.repeatSubunitLength,out.extensions,in.absolute_copies,out.copies,support_status,error_string,normalize_status
24,,1174529,NC_000001.10:g.(103388956_103400026)_(10409439...,CopyNumberChange,[],,Deletion,SimpleAllele,[],1p21.1,...,,,,,,,,True,Unable to find classification for: NC_000001.1...,False
558,,254090,NM_000251.2(MSH2):c.-125_1076+?del,CopyNumberChange,[],,Deletion,SimpleAllele,[],2p21,...,,,,,,,,True,Unable to find classification for: NC_000002.1...,False
1147,,987857,NC_000007.13:g.(?_6010555)_(6027252_6029430)del,CopyNumberChange,[],,Deletion,SimpleAllele,[],7p22.1,...,,,,,,,,True,Unable to find classification for: NC_000007.1...,False
2377,,236334,Single allele,CopyNumberChange,[],,Deletion,SimpleAllele,[],16p13.3-13.2,...,,,,,,,,True,Unable to get GRCh37/GRCh38 assembly for: NC_0...,False
2379,,974178,NC_000016.10:g.(89752223_89758576)_(89816657_?...,CopyNumberChange,[],,Deletion,SimpleAllele,[],16q24.3,...,,,,,,,,True,Unable to find classification for: NC_000016.9...,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3676061,,399136,NCBI36/hg18 Yq11.223(chrY:23282395-23906316)x1,CopyNumberCount,[],,copy number loss,SimpleAllele,[],Yq11.223,...,,,,,,1,,True,Unable to get GRCh37/GRCh38 assembly for: NC_0...,False
3676062,,401800,NCBI36/hg18 Yq11.223(chrY:24283222-24691810)x1,CopyNumberCount,[],,copy number loss,SimpleAllele,[],Yq11.223,...,,,,,,1,,True,Unable to get GRCh37/GRCh38 assembly for: NC_0...,False
3676063,,400073,NCBI36/hg18 Yq11.223-11.23(chrY:25252100-25583...,CopyNumberCount,[],,copy number loss,SimpleAllele,[],Yq11.223-11.23,...,,,,,,1,,True,Unable to get GRCh37/GRCh38 assembly for: NC_0...,False
3676064,,398808,NCBI36/hg18 Yq11.23(chrY:26218479-26497939)x1,CopyNumberCount,[],,copy number loss,SimpleAllele,[],Yq11.23,...,,,,,,1,,True,Unable to get GRCh37/GRCh38 assembly for: NC_0...,False


In [26]:
variation_type_count_supported_not_normalized_df = (
    supported_not_normalized_df.value_counts(
        ["in.variation_type", "in.issue"]
    ).reset_index()
)
variation_type_count_supported_not_normalized_df

Unnamed: 0,in.variation_type,in.issue,count
0,copy number gain,,3245
1,copy number loss,,2243
2,Deletion,,779
3,Duplication,,407
4,single nucleotide variant,,80
5,Indel,,2
6,Insertion,,1


In [27]:
variation_type_count_supported_not_normalized_df.to_csv(
    "variation_analysis_output/variation_type_count_supported_not_normalized_df.csv"
)

### <a id='toc1_4_3_'></a>[Not Supported Variants](#toc0_)

In [28]:
not_supported_df = df.copy()

In [29]:
not_supported_df = not_supported_df.loc[
    ~not_supported_df["support_status"] & ~not_supported_df["normalize_status"]
]
not_supported_df

Unnamed: 0,out,in.variation_id,in.name,in.vrs_class,in.range_copies,in.issue,in.variation_type,in.subclass_type,in.mappings,in.cytogenetic,...,out.state.type,out.state.length,out.state.sequence,out.state.repeatSubunitLength,out.extensions,in.absolute_copies,out.copies,support_status,error_string,normalize_status
0,,427832,NM_000785.3(CYP27B1):c.[1319_1325dupCCCACCC];[...,Not Available,[],haplotype and genotype variations are not supp...,CompoundHeterozygote,Genotype,"[{'system': 'ClinVar', 'code': '427832', 'rela...",,...,,,,,,,,False,Not Supported,False
1,,424704,NM_003977.3(AIP):c.[-125-145_-125-144delCGinsA...,Not Available,[],haplotype and genotype variations are not supp...,CompoundHeterozygote,Genotype,"[{'system': 'ClinVar', 'code': '424704', 'rela...",,...,,,,,,,,False,Not Supported,False
2,,982544,NM_000329.3(RPE65):c.[1067dup];[1543C>T],Not Available,[],haplotype and genotype variations are not supp...,CompoundHeterozygote,Genotype,"[{'system': 'ClinVar', 'code': '982544', 'rela...",,...,,,,,,,,False,Not Supported,False
3,,424730,NM_000372.4(TYR):c.[1276_1282delATGGTTC];[139G>A],Not Available,[],haplotype and genotype variations are not supp...,CompoundHeterozygote,Genotype,"[{'system': 'ClinVar', 'code': '424730', 'rela...",,...,,,,,,,,False,Not Supported,False
4,,424736,NM_005045.3(RELN):c.[2213G>A];[9427T>G],Not Available,[],haplotype and genotype variations are not supp...,CompoundHeterozygote,Genotype,"[{'system': 'ClinVar', 'code': '424736', 'rela...",,...,,,,,,,,False,Not Supported,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3676071,,3778638,"MON1A, ARG249CYS (rs555274206)",Unknown,[],No viable variation members identified.,single nucleotide variant,SimpleAllele,[],,...,,,,,,,,False,Not Supported,False
3676072,,10563,"F9, IVS1, 192A-G",Unknown,[],No viable variation members identified.,single nucleotide variant,SimpleAllele,[],Xq27.1-q27.2,...,,,,,,,,False,Not Supported,False
3676073,,3901867,"ATP2A2, GLU373LYS",Unknown,[],No viable variation members identified.,single nucleotide variant,SimpleAllele,[],,...,,,,,,,,False,Not Supported,False
3676074,,3901246,"CST3, IVS2DS, G-T, +1",Unknown,[],No viable variation members identified.,single nucleotide variant,SimpleAllele,[],,...,,,,,,,,False,Not Supported,False


In [30]:
variation_type_count_not_supported_df = not_supported_df.value_counts(
    ["in.variation_type", "in.issue"]
).reset_index()
variation_type_count_not_supported_df

Unnamed: 0,in.variation_type,in.issue,count
0,Haplotype,haplotype and genotype variations are not supp...,617
1,Diplotype,haplotype and genotype variations are not supp...,596
2,Deletion,No viable variation members identified.,593
3,Microsatellite,repeat expressions are not supported.,456
4,Deletion,sequence for accession not supported by vrs-py...,336
5,CompoundHeterozygote,haplotype and genotype variations are not supp...,297
6,single nucleotide variant,No viable variation members identified.,290
7,Translocation,No viable variation members identified.,282
8,Insertion,No viable variation members identified.,264
9,Deletion,intronic positions are not resolvable in seque...,220


In [31]:
variation_type_count_not_supported_df.to_csv(
    "variation_analysis_output/variation_type_count_not_supported_df.csv"
)

Sanity check: making sure there are no Not Supported variants that have been marked as normalized.

In [32]:
not_supported_but_normalized_df = df.copy()

In [33]:
not_supported_but_normalized_df = not_supported_but_normalized_df.loc[
    (
        ~not_supported_but_normalized_df["support_status"]
        & not_supported_but_normalized_df["normalize_status"]
    )
]
not_supported_but_normalized_df

Unnamed: 0,out,in.variation_id,in.name,in.vrs_class,in.range_copies,in.issue,in.variation_type,in.subclass_type,in.mappings,in.cytogenetic,...,out.state.type,out.state.length,out.state.sequence,out.state.repeatSubunitLength,out.extensions,in.absolute_copies,out.copies,support_status,error_string,normalize_status
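
The check above is confirmed by eye from the empty table. As a sketch, the same condition can be written as an assertion so that a rerun fails loudly if the data changes. The toy DataFrame below is invented; only the two status column names come from this notebook.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "support_status": [True, True, False],
        "normalize_status": [True, False, False],
    }
)

# Rows flagged as not supported yet normalized should not exist.
contradictions = toy[~toy["support_status"] & toy["normalize_status"]]
assert contradictions.empty, (
    f"{len(contradictions)} not-supported variants are marked as normalized"
)
print("sanity check passed")
```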


## <a id='toc1_5_'></a>[Counting variants from each group](#toc0_)

In [34]:
num_supported = len(supported_df)
num_supported_not_normalized = len(supported_not_normalized_df)
num_not_supported_but_normalized = len(not_supported_but_normalized_df)
num_not_supported = len(not_supported_df)

In [35]:
summary_df2 = pd.DataFrame(
    {
        "Supported": [num_supported, num_supported_not_normalized],
        "Not Supported": [num_not_supported_but_normalized, num_not_supported],
    }
)

In [36]:
summary_df2.index = ["Normalized", "Not Normalized"]
summary_df2

Unnamed: 0,Supported,Not Supported
Normalized,3731335,0
Not Normalized,6757,5602
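
The four group sizes can also be derived in one step from the two status columns. The following is a minimal sketch using `pd.crosstab` on invented toy data; it is an alternative to building the table from the four filtered DataFrames, not the approach taken above.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "support_status": [True, True, True, False, False],
        "normalize_status": [True, True, False, False, False],
    }
)

# Rows: normalize_status; columns: support_status; cells: variant counts.
counts = pd.crosstab(toy["normalize_status"], toy["support_status"])

# Force all four cells to exist and fix their order; a status value with
# no rows at all would otherwise be missing from the table.
counts = counts.reindex(index=[True, False], columns=[True, False], fill_value=0)
counts.index = ["Normalized", "Not Normalized"]
counts.columns = ["Supported", "Not Supported"]
print(counts)
```

Because every row falls into exactly one cell, the cells always sum to the number of variants, which gives a quick consistency check against `len(df)`.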


## <a id='toc1_6_'></a>[Counting variant types for each group](#toc0_)

In [37]:
variation_type_count_summary_df = pd.merge(
    pd.merge(
        variation_type_count_supported_df,
        variation_type_count_supported_not_normalized_df,
        on="in.variation_type",
        how="left",
    ),
    variation_type_count_not_supported_df,
    on="in.variation_type",
    how="right",
)
variation_type_count_summary_df = variation_type_count_summary_df.replace(
    np.nan, 0, regex=True
)

In [38]:
variation_type_count_summary_df = variation_type_count_summary_df.rename(
    columns={
        "count_x": "supported",
        "count_y": "supported_not_normalized",
        "count": "not_supported",
    }
)

In [39]:
variation_type_count_summary_df.to_csv(
    "variation_analysis_output/variation_type_count_summary_df.csv"
)
variation_type_count_summary_df

Unnamed: 0,in.variation_type,in.issue_x,supported,in.issue_y,supported_not_normalized,in.issue,not_supported
0,Haplotype,0.0,0.0,0.0,0.0,haplotype and genotype variations are not supp...,617
1,Diplotype,0.0,0.0,0.0,0.0,haplotype and genotype variations are not supp...,596
2,Deletion,,159681.0,,779.0,No viable variation members identified.,593
3,Microsatellite,,36028.0,0.0,0.0,repeat expressions are not supported.,456
4,Deletion,,159681.0,,779.0,sequence for accession not supported by vrs-py...,336
5,CompoundHeterozygote,0.0,0.0,0.0,0.0,haplotype and genotype variations are not supp...,297
6,single nucleotide variant,,3388754.0,,80.0,No viable variation members identified.,290
7,Translocation,0.0,0.0,0.0,0.0,No viable variation members identified.,282
8,Insertion,,12982.0,,1.0,No viable variation members identified.,264
9,Deletion,,159681.0,,779.0,intronic positions are not resolvable in seque...,220
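
Because the final merge uses `how="right"`, the table above keeps only variation types that occur in the not-supported group, and the supported counts repeat for every `in.issue` row of the same type (Deletion appears three times). If one row per variation type is wanted instead, one possible sketch is to sum each group by type first and then align the three results with an outer join. The counts and the issue labels below are invented, and the output column names are illustrative choices.

```python
import pandas as pd

supported = pd.DataFrame(
    {"in.variation_type": ["Deletion", "Inversion"], "count": [10, 4]}
)
supported_not_normalized = pd.DataFrame(
    {"in.variation_type": ["Deletion"], "count": [2]}
)
not_supported = pd.DataFrame(
    {
        "in.variation_type": ["Deletion", "Deletion", "Haplotype"],
        "in.issue": ["issue A", "issue B", "issue C"],
        "count": [3, 1, 5],
    }
)


def per_type(frame: pd.DataFrame, name: str) -> pd.Series:
    """Sum counts per variation type and label the resulting Series."""
    return frame.groupby("in.variation_type")["count"].sum().rename(name)


summary = (
    pd.concat(
        [
            per_type(supported, "supported"),
            per_type(supported_not_normalized, "supported_not_normalized"),
            per_type(not_supported, "not_supported"),
        ],
        axis=1,  # align on the variation-type index (outer join by default)
    )
    .fillna(0)  # a type absent from a group has zero variants there
    .astype(int)
)
print(summary)
```

This drops the per-issue breakdown in exchange for a table in which every variation type appears exactly once, including types that are fully supported.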
