Same sample ID on multiple lanes causes error #32

Closed
reisingerf opened this issue Feb 13, 2018 · 9 comments
Comments

@reisingerf

Hi, I have a sample sheet that has the same sample ID but on multiple lanes, which in our case can happen quite frequently. This case is currently not supported. Could this be added?

Example [Data] section:

[Data],,,,,,,
Lane,Sample_ID,Sample_Name,Sample_Plate,Sample_Well,I7_Index_ID,index,Sample_Project,Description
1,WES013BL,,,,A010,TAGCTT,,
1,WES013FR,,,,A027,ATTCCT,,
1,MDx150891,,,,A012,CTTGTA,,
1,MDx150892,,,,A016,CCGTCC,,
2,WES013BL,,,,A010,TAGCTT,,
2,WES013FR,,,,A027,ATTCCT,,
2,MDx150891,,,,A012,CTTGTA,,
2,MDx150892,,,,A016,CCGTCC,,
@clintval
Owner

Yes definitely. Thanks for submitting this issue. I will add this in for the next release.

@reisingerf
Author

Thanks! Great work!
Your lib saved me quite some coding!

@clintval
Owner

clintval commented Feb 13, 2018

Awesome to hear! It's saved me quite a bit of effort too.

Please keep the suggestions coming, I only use sample sheets for a very specific task and will need the community's help in making this useful across applications.

@reisingerf
Author

Sure, I will report if we have any other issues/suggestions.

@PertuyF

PertuyF commented Feb 13, 2018

@clintval, beware that according to Illumina's reference, the Sample_ID must be a unique identifier.

At a minimum, the one column that is universally required is Sample_ID,
which provides a unique string identifier for each sample.

So I assume you should only allow duplicates if the Lane column is provided, and if its values are different for a given Sample_ID.
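As a rough sketch of that rule (not the library's actual implementation, and using only the standard `csv` module), a check could treat the pair of Sample_ID and Lane as the unique key. The function name `find_duplicate_samples` is hypothetical, and the input is assumed to be the [Data] section starting at its column header row:

```python
import csv
import io


def find_duplicate_samples(data_section):
    """Return (Sample_ID, Lane) pairs that occur more than once.

    Assumption: `data_section` is the CSV text of the [Data] section,
    beginning with the column header row (without the "[Data]" line).
    A missing or empty Lane is recorded as None, so repeated Sample_IDs
    without lanes are still reported as duplicates.
    """
    reader = csv.DictReader(io.StringIO(data_section))
    seen = set()
    duplicates = []
    for row in reader:
        # Repeating a Sample_ID is only acceptable when the Lane differs.
        key = (row["Sample_ID"], row.get("Lane") or None)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates
```

With the example sheet from this issue, every Sample_ID appears once per lane, so the returned list would be empty. A sheet that mixes rows with and without a Lane value for the same Sample_ID is not flagged by this sketch.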

@reisingerf, in your case I am not sure the Lane column is necessary, as the documentation from a recent version of bcl2fastq mentions:

When the Lane column of the sample sheet Data section is populated, only those lanes are converted. When the Lane column is not used, all lanes are converted.

Unless you use it to extract data only for specific lanes from a larger flowcell, I guess.

@clintval
Owner

clintval commented Feb 14, 2018

Thanks @PertuyF. I recognize some Illumina sequencers may allow per-lane loading, which would support the notion that you could technically have identical Sample_ID values on the same flowcell, albeit on different lanes, with the same sample indexes.

I am willing to be permissive on the specification instead of restrictive, since sample sheets are used by more platforms than just Illumina (e.g. 10x).

I am open to discussion on how permissive or restrictive this library should be when importing sample sheets.

Let me dwell on this a bit and I will respond back.

@reisingerf, feel free to comment on your specific application and need for this feature. I am interested in the applications you are pursuing.

@reisingerf
Author

We are sequencing cancer samples using Illumina's NovaSeq. We have a few reasons:
1. When you specify a lane in the sample sheet for the same sample ID, it generates SAMPLE_S([0-9]+)_L00[1-8]_R[1-2]_001.fastq.gz. This helps to trace the source of a FASTQ file back to the lane (this info is also in the read header, but it's much easier to just look at the FASTQ file name).
2. In some cases it may be unavoidable to specify the lane number due to logistical restrictions when we need to load by lane, either because we don't have enough indexes or because we need to add up to the desired coverage.
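
To illustrate the first point, here is a minimal sketch of recovering the lane from a file name that follows the pattern above. The helper `parse_fastq_name` and the example file name are made up for illustration, and the sketch assumes lanes 1 to 8 as in the pattern:

```python
import re

# Mirrors the naming pattern quoted above; the sample name is taken to be
# everything before the "_S<number>" part (an assumption of this sketch).
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<number>[0-9]+)"
    r"_L00(?P<lane>[1-8])_R(?P<read>[12])_001\.fastq\.gz$"
)


def parse_fastq_name(filename):
    """Return the sample, sample number, lane, and read, or None if no match."""
    match = FASTQ_NAME.match(filename)
    if match is None:
        return None
    return {
        "sample": match["sample"],
        "sample_number": int(match["number"]),
        "lane": int(match["lane"]),
        "read": int(match["read"]),
    }


# Hypothetical file name built from a Sample_ID in the example sheet.
info = parse_fastq_name("WES013BL_S1_L002_R1_001.fastq.gz")
print(info["sample"], info["lane"])
```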

clintval added a commit that referenced this issue Feb 19, 2018
@clintval
Owner

Hi @reisingerf, I implemented the feature and made a new release as v0.4.0. Let me know how it works for you. I tested it against your sample sheet snippet and it parsed successfully.

Feel free to update/install from PyPI:

$ pip install sample_sheet

@reisingerf
Author

Great, thanks!
Works fine now for my use cases!
Great job!
