Running a single sample #177

josenimo · 2024-01-22T14:52:02Z

Thank you for this amazing tool. I have a couple of questions regarding CyLinter.

I am running MCMICRO using Mesmer, so I have to adapt a little to conform to the expected input file structure.

First, what is the structure of the seg files? I remember that these come from the S3segmenter QC folder. From what I could explore, the file is made up of two channels, the first being a binary segmentation mask with 0 for background and 42594 for foreground. The inside of the cells is also 0, meaning the cells are not filled (as I would expect). The second channel seems to be the nuclear channel used for the segmentation step.
How critical is this file? I will try to replicate exactly what I find in the example files. Any detail that matters for downstream?

Second, can I run a single sample? I am just trying to go through CyLinter for a single sample. However, a KeyError: 0 shows up at the aggregateData step, which I assume is not necessary for a single sample. Is there a way to bypass this step for a single sample?

StackTrace below:

(cylinter-env47) CMP06623:P23_Core19_PoC jnimoca$ cylinter Cylinter/config.yml 
INFO: Reading configuration file
INFO: Executing pipeline
Running: <function aggregateData at 0x1550615a0>
INFO: ======================================================================
INFO: RUNNING MODULE: aggregateData

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cylinter-env47/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3790, in get_loc
    return self._engine.get_loc(casted_key)
  File "index.pyx", line 152, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 181, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 2606, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 2630, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cylinter-env47/bin/cylinter", line 10, in <module>
    sys.exit(main())
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cylinter-env47/lib/python3.10/site-packages/cylinter/cylinter.py", line 49, in main
    pipeline.run_pipeline(config, args.module)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cylinter-env47/lib/python3.10/site-packages/cylinter/pipeline.py", line 132, in run_pipeline
    data = module(data, qc, config)  # getattr(qc, module)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cylinter-env47/lib/python3.10/site-packages/cylinter/components.py", line 56, in wrapper
    result = func(*args, **kwargs)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cylinter-env47/lib/python3.10/site-packages/cylinter/modules/aggregateData.py", line 16, in aggregateData
    markers, dna1, dna_moniker, abx_channels = read_markers(
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cylinter-env47/lib/python3.10/site-packages/cylinter/utils.py", line 321, in read_markers
    dna1 = markers['marker_name'][markers['channel_number'] == markers['channel_number'].min()][0]
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cylinter-env47/lib/python3.10/site-packages/pandas/core/series.py", line 1040, in __getitem__
    return self._get_value(key)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cylinter-env47/lib/python3.10/site-packages/pandas/core/series.py", line 1156, in _get_value
    loc = self.index.get_loc(label)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/cylinter-env47/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3797, in get_loc
    raise KeyError(key) from err
KeyError: 0

Config file here, as .txt:
config.txt

Thank you again
Best,
Jose

josenimo · 2024-01-23T11:56:32Z

Dear @gjbaker ,

I figured out what is happening. It has nothing to do with it being a single sample.
It fails at this line of code:
dna1 = markers['marker_name'][markers['channel_number'] == markers['channel_number'].min()][0]
https://github.com/josenimo/cylinter/blob/0e1cb69bec85ca5f603f8df920f5d7b5a1e5b1be/cylinter/utils.py#L321

The [0] looks up the channel name with index label 0. The issue in my case is that I am excluding 5 channels, so the row with the lowest channel number has index label 5. When [0] is called, no row with label 0 exists and it runs into a KeyError.
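
The behaviour described above can be reproduced outside CyLinter with a few lines of pandas. This is a minimal sketch, not CyLinter's code: the markers table and the way the first five rows are dropped are made up for illustration. It shows that `[0]` on a Series with an integer index is a label lookup, whereas `.iloc[0]` takes the first row by position.

```python
import pandas as pd

# Hypothetical markers table (not a real markers.csv): eight channels.
markers = pd.DataFrame({
    "channel_number": range(1, 9),
    "marker_name": ["DNA1", "A", "B", "C", "D", "E", "F", "DNA2"],
})

# Assumption: excluding five channels leaves rows whose index labels are 5, 6, 7.
remaining = markers[markers["channel_number"] > 5]

# Same slicing pattern as the failing line: a one-row Series of marker names.
lowest = remaining["marker_name"][
    remaining["channel_number"] == remaining["channel_number"].min()
]

try:
    lowest[0]  # label-based lookup: no row carries the label 0
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Position-based lookup returns the first (and only) row regardless of its label.
dna1 = lowest.iloc[0]
print(label_lookup_failed, dna1)
```

Swapping `[0]` for `.iloc[0]` only removes the KeyError; it does not address a counterstain that sits on the last channel.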

I also have another problem, which is that my nuclear stain is not on the first channel; it is actually on the last channel. This means that .min() would still not work for me. Do you think there could be an easy way for me to specify the channel name as a string in the config.yml? This would be nice.

Best,
Jose

gjbaker · 2024-01-23T12:47:50Z

Hi @josenimo,

The line of code you shared looks for the name of the channel with the lowest index in the channel_number column of the markers.csv file. In our lab's standard workflow, this index almost always corresponds to the first counterstain channel, which coincides with the first channel in the OME-TIFF file. The [0] at the end of the line is meant to extract the value (i.e., channel name) from the single-row Series returned by the pandas slicing operation.

I assume you are using the markersToExclude parameter in config.yml to exclude the 5 channels. Note that for cyclic studies this should only include immunomarker channels, not counterstain channels, as these are used by the cycleCorrelation module to correlate counterstain signals across imaging cycles in order to drop cells that have failed to remain stable over the course of imaging.

A relatively easy workaround for your counterstain channel being the last channel in the image is to reassign the dna1 variable in utils.py to explicitly take the name of the channel you wish to consider as your default DNA channel.
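
As a rough illustration of that workaround, the selection could be wrapped in a small helper that prefers an explicitly named channel and only falls back to the lowest channel_number. The function `pick_dna_channel` and its `dna_name` argument are hypothetical; they are not part of CyLinter or its config.yml.

```python
from typing import Optional

import pandas as pd


def pick_dna_channel(markers: pd.DataFrame, dna_name: Optional[str] = None) -> str:
    """Return the marker name to treat as the default DNA channel (hypothetical helper)."""
    if dna_name is not None:
        # Explicit choice: make sure the name really exists in the markers table.
        if dna_name not in set(markers["marker_name"]):
            raise ValueError(f"{dna_name!r} is not listed in marker_name")
        return dna_name
    # Fallback: the marker with the lowest channel_number, looked up by its own label.
    return markers.loc[markers["channel_number"].idxmin(), "marker_name"]


# Made-up table in which the counterstain is the last channel.
example = pd.DataFrame(
    {"channel_number": [6, 7, 8], "marker_name": ["E", "F", "DNA_last"]},
    index=[5, 6, 7],
)
print(pick_dna_channel(example, "DNA_last"))
print(pick_dna_channel(example))
```

Using `idxmin()` together with `.loc` keeps the fallback independent of how the index is labelled after rows have been dropped.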

Regarding your earlier question about the form and function of seg files: these are binary images showing segmentation contours, which are used by the program as a fiducial for evaluating segmentation quality. In the case of seg files output by MCMICRO, these consist of two channels, as you noted earlier. That said, CyLinter only reads the binary channel, so this is the only channel one would actually need to recreate.
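
For anyone starting from a label mask (for example, one produced by Mesmer) rather than from S3segmenter output, a contour image of this kind could be derived roughly as follows. This is a sketch under assumptions: the input is a 2D integer array with 0 for background and one positive label per cell, and the foreground value of 1 is arbitrary, since the thread does not say whether a particular value is required. The function name `outlines_from_labels` is hypothetical.

```python
import numpy as np


def outlines_from_labels(labels: np.ndarray, foreground: int = 1) -> np.ndarray:
    """Mark cell contours with `foreground`; background and cell interiors stay 0."""
    padded = np.pad(labels, 1, mode="edge")
    center = padded[1:-1, 1:-1]
    differs = np.zeros(labels.shape, dtype=bool)
    # A pixel is on a contour if it belongs to a cell and any 4-neighbour
    # carries a different label (background or another cell).
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        neighbour = padded[
            1 + dy : padded.shape[0] - 1 + dy,
            1 + dx : padded.shape[1] - 1 + dx,
        ]
        differs |= neighbour != center
    contour = differs & (center > 0)
    return np.where(contour, foreground, 0).astype(np.uint16)


# Toy mask: a single 5x5 cell inside a 7x7 image.
toy = np.zeros((7, 7), dtype=np.int32)
toy[1:6, 1:6] = 1
outline = outlines_from_labels(toy)
print(outline)
```

The result has unfilled cells, matching the description of the binary channel given earlier in the thread; writing it to disk in the layout CyLinter expects is left out here.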

Hope this helps,
-Greg

josenimo · 2024-01-24T13:42:15Z

Hey @gjbaker ,
Thank you for your time and explanation, that does make sense.

I am hard-coding a specific channel name for my counterstain.
Small question:
Would I be able to modify the Python code inside the mamba environment? I just can't find it and assume it is compiled as a binary. Alternatively, would I be able to create a new environment with all the requirements and then install from a modified clone of the repo?

Thank you again,
Best,
Jose

gjbaker · 2024-01-24T14:14:18Z

Hi @josenimo,

Assuming you are on a Mac, all one would have to do is update the read_markers function in the file found here: ~/miniconda3/envs/cylinter/lib/python3.10/site-packages/cylinter/utils.py (or an analogous path) with the default DNA channel you wish to use.
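
If the path differs on a given machine, the file can also be located from within the active environment using the standard library. The helper below is a generic sketch (the name `module_source_path` is made up); it is demonstrated on the standard-library module `json.decoder`, and the same call with `"cylinter.utils"` should work in an environment where CyLinter is installed.

```python
import importlib.util
from pathlib import Path


def module_source_path(module_name: str) -> Path:
    """Return the file that an importable module would be loaded from."""
    # Note: for dotted names, find_spec imports the parent package first.
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"cannot locate {module_name!r}")
    return Path(spec.origin)


# Demonstrated on a standard-library module so the snippet runs anywhere.
path = module_source_path("json.decoder")
print(path)
```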

-Greg

josenimo · 2024-01-24T14:58:29Z

Hey @gjbaker ,

Thank you, I had never hacked into my conda environments :)

I noticed that the channel being loaded was still the first one.
I realized that the culprit was line 293 of selectROI.py.
I just switched it to the correct channel (4 in my case). I guess I might have to change this for all other modules, but that should be a simple fix. Just writing it here for documentation.
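
Rather than hard-coding the number in each module, the index could be derived from the markers table by name. The sketch below is not taken from CyLinter: `zero_based_channel_index` is a hypothetical helper, and it assumes that channel_number is 1-based and follows the channel order of the image, which should be checked against the actual markers.csv and image before relying on it.

```python
import pandas as pd


def zero_based_channel_index(markers: pd.DataFrame, marker_name: str) -> int:
    """Map a marker name to a zero-based image channel index (hypothetical helper)."""
    match = markers.loc[markers["marker_name"] == marker_name, "channel_number"]
    if len(match) != 1:
        raise ValueError(f"expected exactly one row for {marker_name!r}, found {len(match)}")
    # Assumption: channel_number starts at 1, so subtract 1 for array indexing.
    return int(match.iloc[0]) - 1


# Made-up five-channel table with the counterstain last.
table = pd.DataFrame({
    "channel_number": [1, 2, 3, 4, 5],
    "marker_name": ["A", "B", "C", "D", "DNA"],
})
print(zero_based_channel_index(table, "DNA"))
```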

Thanks again, I will close the issue now.

josenimo closed this as completed Jan 24, 2024
