TACA demultiplexing need to use Undetermined reads #112

Open
vezzi opened this issue May 19, 2015 · 7 comments

@vezzi

vezzi commented May 19, 2015

No description provided.

@vezzi vezzi added this to the Demultiplex all the things milestone May 19, 2015
@vezzi vezzi assigned vezzi and Galithil and unassigned vezzi May 19, 2015
@vezzi

vezzi commented May 19, 2015

@guillermo-carrasco and @Galithil (I am adding @parlundin and @senthil10 for knowledge and suggestions): this issue is about the use of Undetermined reads in Xten after demultiplexing.
You need to coordinate on which part of the code each of you works on, but this is a part that needs to be plugged in after demultiplexing and before sending data to Nestor, so it should not be difficult to work separately and then merge your work.

Pseudo code after demultiplexing

  • if FC is pooled (i.e., there is at least one lane with more than one index)
    • foreach lane in FC
      • if lane.undetermine_prop_for_pooled_lane:
        • continue
      • else:
        • stop working on this FC, mail the user
  • else:
    • foreach lane in FC
      • if lane.undetermine_prop_for_lane_with_single_sample:
        • copy Undetermined reads into the correct Sample folder
      • else:
        • stop working on this FC, mail the user
  • send FC to Nestor
  • start pipeline on Nestor

The conditions (subject to change) are:

  • lane.undetermine_prop_for_pooled_lane
    • % Undetermined < 10%
    • the most frequent undetermined index occurs less than 40% (why 40%? good question I am guessing)
  • lane.undetermine_prop_for_lane_with_single_sample
    • % Undetermined < 10%
    • the most frequent undetermined index occurs less than 40%
    • Q30 must be higher than 75%

The last point could also be applied to lane.undetermine_prop_for_pooled_lane; however, I think that for now we only have experience with the second case, so we need to see some of these demultiplexed flowcells before making a decision.
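
For reference, the flow and thresholds above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not TACA code: the `Lane` class, its field names, `undetermined_ok` and `check_flowcell` are all hypothetical, and the thresholds are the provisional ones listed above. It also assumes the Q30 figure is available per lane, since the text does not say whether Q30 refers to the whole lane or only to the Undetermined reads.

```python
from dataclasses import dataclass

# Provisional thresholds taken from the list above (subject to change).
MAX_UNDETERMINED_PCT = 10.0
MAX_TOP_INDEX_PCT = 40.0
MIN_Q30_PCT = 75.0


@dataclass
class Lane:
    """Hypothetical per-lane summary; the field names are assumptions."""
    number: int
    n_indexes: int           # number of sample indexes expected in the lane
    undetermined_pct: float  # % of reads that ended up in Undetermined
    top_index_pct: float     # % taken by the most frequent undetermined index
    q30_pct: float           # % Q30 (assumed to be reported per lane)

    @property
    def is_pooled(self) -> bool:
        return self.n_indexes > 1


def undetermined_ok(lane: Lane, single_sample: bool) -> bool:
    """Apply the pooled or the single-sample condition to one lane."""
    if lane.undetermined_pct >= MAX_UNDETERMINED_PCT:
        return False
    if lane.top_index_pct >= MAX_TOP_INDEX_PCT:
        return False
    # The Q30 requirement is only listed for lanes with a single sample.
    if single_sample and lane.q30_pct <= MIN_Q30_PCT:
        return False
    return True


def check_flowcell(lanes):
    """Decide whether the FC can be sent on, and which lanes need merging."""
    # The FC counts as pooled if at least one lane has more than one index.
    fc_pooled = any(lane.is_pooled for lane in lanes)
    failed = [
        lane.number
        for lane in lanes
        if not undetermined_ok(lane, single_sample=not fc_pooled)
    ]
    proceed = not failed
    # Undetermined reads are copied to the Sample folder only for non-pooled FCs.
    merge = [lane.number for lane in lanes] if proceed and not fc_pooled else []
    return {"proceed": proceed, "merge_undetermined": merge, "failed": failed}
```

A caller would stop and mail the user when `proceed` is false, and otherwise copy the Undetermined reads for the lanes in `merge_undetermined` before sending the FC to Nestor.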

@senthil10

Hi @vezzi, I have a question: will all projects run on a HiSeq X be processed, analyzed, and delivered from Nestor even if the project is not an IGN project?

A suggestion for pseudo code (mostly what you said) :)

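for lane in FC:
    lane_type = 'pooled' if lane.is_pooled else 'single'
    if lane.ok_to_proceed(lane_type):
        # Bingo!! Proceed!!
        continue
    else:
        # Stop and mail the user
        break

def ok_to_proceed(lane, lane_type):
    # Percentages are plain numbers here (10 means 10%).
    # Failing either undetermined condition is enough to stop.
    if lane.undetermined > 10 or lane.top_index > 40:
        return False
    # Q30 is only checked for lanes with a single sample.
    if lane_type == 'single' and lane.Q30 < 75:
        return False
    return True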

@Galithil

You should definitely add @remiolsen, as he worked on that specific point for HiSeq runs.

@vezzi

vezzi commented May 19, 2015

@senthil10 HiSeq X FCs are transferred to and processed only on Nestor, and delivered to Milou. IGN projects were a bit in the middle: they were moved to Milou, and from Milou to Nestor to be processed.

@guillermo-carrasco

@Galithil is this fixed with one of your latest PRs?

@Galithil

Galithil commented Jun 2, 2015

it should be done, yes. onsite testing would be nice though.


@guillermo-carrasco

👍 Cool, please close this once you're sure it is working :)
