DOC: Example on how to parse comments

As a bioinformatician, I frequently use pandas to parse hideous scientific file formats. Standard biological file formats often include a block of header comments that carry useful information about the file, such as the reference build, species, and format version.

For example, a typical VCF file: 

    #VCF v1.1.4
    #RefBuild: hg19
    #Assay:Oncopanel
    #CHROM    POS       REF    ALT    QUAL    FILTER    INFO
    22        123845    A      G      .       .         STRAND=+

Above, we'd need to know the RefBuild to determine which species and genome version we are working with, and the Assay to know which panel of mutations we are looking for. All manner of strange things turn up in these headers, and they are often important to our analysis.

Currently, I need to read the file once without pandas to grab the header information, and then read it again with pandas to skip the commented lines and turn the content into a DataFrame. I primarily build pipelines for processing very large datasets, and this little workaround is often the bane of my existence.
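For illustration, here is a minimal sketch of how the double read can be avoided today by consuming the comment lines from an open handle and then handing the same handle to `pd.read_csv`. The helper names `read_header_comments` and `read_annotated_table` are hypothetical, not part of pandas. The sketch assumes a seekable text handle, that the last comment line holds the column names (as in the example above), that metadata lines use a `key: value` layout, and that fields are whitespace-separated.

```python
def read_header_comments(handle, comment="#"):
    """Consume leading comment lines from a seekable text handle.

    Returns (metadata, columns) and leaves the handle positioned at the
    first data row. Hypothetical helper, not a pandas API.
    """
    lines = []
    while True:
        start = handle.tell()
        line = handle.readline()
        if not line or not line.startswith(comment):
            handle.seek(start)  # rewind so the first data row is not lost
            break
        lines.append(line[len(comment):].strip())

    # Assumption: the last comment line holds the column names.
    columns = lines.pop().split() if lines else None

    metadata = {}
    for text in lines:
        key, sep, value = text.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
        else:
            # Lines without a colon are kept verbatim rather than guessed at.
            metadata.setdefault("_unparsed", []).append(text)
    return metadata, columns


def read_annotated_table(handle, comment="#"):
    """Return (metadata, DataFrame) from a single pass over the handle."""
    # Imported here so the header parser alone has no third-party dependency.
    import pandas as pd

    metadata, columns = read_header_comments(handle, comment)
    # read_csv continues from the handle's current position, i.e. the data rows.
    frame = pd.read_csv(handle, sep=r"\s+", names=columns, header=None)
    return metadata, frame
```

With a real file this would be called as `with open(path) as handle: metadata, frame = read_annotated_table(handle)`. It only touches the header lines once, but it is a workaround in user code rather than the built-in behaviour requested here.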

I would like to put in a feature request to add this ability. Similar questions and requests [already exist on Stack Overflow](https://stackoverflow.com/questions/39724298/pandas-extract-comment-lines) as well. Thank you.


DOC: Example on how to parse comments #22055
