comparison cell numbers in flowjo vs import using flowWorkspace #256

Closed
staedlern opened this issue Aug 22, 2018 · 4 comments

@staedlern

Dear Greg and Mike

I am trying to do a small comparison of cell cluster assignment as obtained via manual gating vs clustering using flowSOM (e.g. by F1 score). To do so I intend to proceed as follows:

  1. manual gating is available in flowjo as wsp workspace
  2. import wsp using flowWorkspace package
  3. get single cell expression data for manual gating populations p_1,...,p_K via calls ex_k=fsApply(getData(gs,y=p_k),flowCore::exprs)
  4. stack all ex_k, k=1,...K together in order to get one expression matrix ex_all with manual gating labels
  5. run flowSOM on ex_all and retrieve flowSOM labels
  6. compare flowSOM labels with manual gating labels
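For step 6, one possible way to compute a per-population F1 score is sketched below in base R. It is only an illustration on toy data: the vectors `manual` and `cluster` are hypothetical stand-ins for the manual gating labels and the flowSOM labels, and matching each manual population to its best-scoring cluster is an assumption about how the comparison is set up, not something prescribed by either package.

```r
# Toy labels (hypothetical); in practice `manual` would hold the gate of
# each cell and `cluster` the cluster assigned to the same cell.
manual  <- c("A", "A", "A", "B", "B", "B", "B", "C")
cluster <- c(1, 1, 2, 2, 2, 2, 3, 3)

# Contingency matrix: rows = manual populations, columns = clusters
tab <- unclass(table(manual, cluster))

# precision[i, j]: fraction of cluster j that belongs to population i
precision <- sweep(tab, 2, colSums(tab), "/")
# recall[i, j]: fraction of population i that falls into cluster j
recall <- sweep(tab, 1, rowSums(tab), "/")

# F1 for every (population, cluster) pair; 0/0 yields NaN when a pair
# shares no cells, which is set to 0
f1 <- 2 * precision * recall / (precision + recall)
f1[is.nan(f1)] <- 0

# Best-matching cluster and its F1 score for each manual population
best_f1 <- apply(f1, 1, max)
best_cluster <- colnames(f1)[apply(f1, 1, which.max)]

print(data.frame(population = rownames(f1), best_cluster, best_f1))
```

Averaging `best_f1` over the populations would give a single summary number, if one is wanted.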

What are your general thoughts on these steps?

So far I observed the following problems:

  1. the expression values I get via parseWorkspace are already transformed. However, I would like to work with the compensated but untransformed data. I know that I could choose execute=FALSE, but then I lose the gating and compensation. Is there a way around this?
  2. the number of events (=cells) obtained by e.g. calling fs <- getData(gs,y="/CD45+/Single Cells-1/Single Cells"); dim(e_fs <- fsApply(fs, flowCore::exprs)) is different from the number of events in FlowJo for the same population. Do you know what the reason could be?

Any kind of help would be very much appreciated.

Thanks a lot!

Best regards,

Nicolas

@gfinak
Member

gfinak commented Aug 22, 2018

The numbers in the gates may differ by a few cells here and there. Generally this has no impact on downstream inference. Unless there is a large difference, which we can investigate, it is likely expected behavior.

You can back-transform the data; @mikejiang can provide details, as I can never recall how.

To get around the cell count issue, call getData on the root node to get all the cells, then call getIndices for each gate to get a logical vector with one entry per cell that indicates whether the cell is in that gate.
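As a rough illustration of how such per-gate indicator vectors could be turned into one label per cell, here is a base R sketch on toy data. The matrix `member` and the gate names `P_1` and `P_2` are hypothetical; in a real session each column would presumably come from a getIndices call on one sample for one gate. The labels "ungated" and "overlap" are likewise just conventions chosen for this example.

```r
# Toy membership matrix (hypothetical): one row per cell, one logical
# column per gate, e.g. as returned gate by gate for a single sample.
member <- cbind(
  P_1 = c(TRUE,  FALSE, FALSE, TRUE, FALSE),
  P_2 = c(FALSE, TRUE,  FALSE, TRUE, FALSE)
)

# Number of gates each cell falls into
n_gates <- rowSums(member)

# Start with every cell unlabeled, then fill in the unambiguous ones
label <- rep("ungated", nrow(member))
only_one <- n_gates == 1
label[only_one] <- colnames(member)[
  max.col(member[only_one, , drop = FALSE] * 1, ties.method = "first")
]

# Cells that sit in more than one gate are flagged rather than duplicated
label[n_gates > 1] <- "overlap"

print(table(label))
```

Because every cell of the root node keeps exactly one row, the label vector lines up with the rows of the root expression matrix and no cell is counted twice.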

@mikejiang
Member

@gfinak getIndices should be consistent with openCyto.count, so it may differ slightly from xml.count.

Regarding retrieving single-cell expression data, there is a dedicated API, getSingleCellExpression, for this task (I can give you more information if you decide to go this route). The potential difference from your approach is that it does not duplicate cells (rows) in the final data matrix, as your method might, depending on whether or not your P_1, P_2, ... overlap.

Once you have the final merged fs, you can transform it back to the raw scale. Here is an example:

# retrieve the inverse transformation from the first sample (assuming all samples use the same transformation parameters)
trans <- getTransformations(gs[[1]], inverse = TRUE)
fs_raw <- transform(fs, trans)

@staedlern
Author

Thank you both for your help!

@gfinak
Just to confirm: is it normal to "lose" a few cells when parsing a FlowJo workspace? In my example I get n=4114001 cells by calling

fs <- getData(gs)
singlecell <- unlist(lapply(gs, getIndices, y="Single Cells"))
table(singlecell)[2]

but n=4113769 by exporting the respective population from FlowJo and reading it using read.flowSet. So the counts differ by 232 cells...

@mikejiang
I tried your example with the transformation...I get the error

Error in .local(_data, ...) :
names of 'translist' must be consistent with flow data!

when running the following:

fs <- getData(gs)
transf <- getTransformations(gs[[1]], inverse=TRUE)
fsraw <- transform(fs[,names(transf)], transf)

@mikejiang
I read the help page of getSingleCellExpression. It would be great if you could give me more info on how you would tackle it via getSingleCellExpression. Basically I would like to get back from flowJo workspace the raw expression matrix but with each cell annotated whether it is in P_1, P_2,.... (assuming P_1, P_2 are non-overlapping)

@mikejiang
Member

mikejiang commented Aug 23, 2018

Try adding this

transf <- transformList(names(transf), transf)

before you transform the data.
I see that you intend to inverse-transform the entire data set:

fs <- getData(gs)

If so, don't forget to assign the inverse-transformed flowSet back to gs:

flowData(gs) <- fs_inverse
