@bzrry (Contributor) commented Sep 17, 2025

I hit the crash below because the indices were floats rather than ints. Initializing the empty array with an integer dtype solved the issue.

Seed set to 12
/mnt/vcc-data/competition_support_set/{competition_train,k562_gwps,rpe1,jurkat,k562,hepg2}.h5
Dataset path does not exist: /mnt/vcc-data/competition_support_set/{competition_train,k562_gwps,rpe1,jurkat,k562,hepg2}.h5
Processed competition_train: 221273 train, 0 val, 0 test
Processing replogle_h1:  17%|███▌                 | 1/6 [00:00<00:01,  2.65it/s]No cell barcode information found in /mnt/vcc-data/competition_support_set/k562_gwps.h5. Generating generic barcodes.
Processed k562_gwps: 111605 train, 0 val, 0 test
Processing replogle_h1:  33%|███████              | 2/6 [00:00<00:00,  4.46it/s]No cell barcode information found in /mnt/vcc-data/competition_support_set/rpe1.h5. Generating generic barcodes.
Processed rpe1: 22317 train, 0 val, 0 test
Processing replogle_h1:  33%|███████              | 2/6 [00:00<00:00,  4.46it/s]No cell barcode information found in /mnt/vcc-data/competition_support_set/jurkat.h5. Generating generic barcodes.
Processed jurkat: 21412 train, 0 val, 0 test
Processing replogle_h1:  33%|███████              | 2/6 [00:00<00:00,  4.46it/s]No cell barcode information found in /mnt/vcc-data/competition_support_set/k562.h5. Generating generic barcodes.
Processed k562: 18465 train, 0 val, 0 test
Processing replogle_h1:  33%|███████              | 2/6 [00:00<00:00,  4.46it/s]No cell barcode information found in /mnt/vcc-data/competition_support_set/hepg2.h5. Generating generic barcodes.
Processed hepg2: 4976 train, 0 val, 9386 test
Processing replogle_h1: 100%|█████████████████████| 6/6 [00:00<00:00, 10.06it/s]
Traceback (most recent call last):
  File "/root/state/.venv/bin/state", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/root/state/src/state/__main__.py", line 120, in main
    run_tx_train(cfg)
  File "/root/state/src/state/_cli/_tx/_train.py", line 123, in run_tx_train
    dl = data_module.train_dataloader()
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/state/.venv/lib/python3.11/site-packages/cell_load/data_modules/perturbation_dataloader.py", line 338, in train_dataloader
    return self._create_dataloader(self.train_datasets, test=test)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/state/.venv/lib/python3.11/site-packages/cell_load/data_modules/perturbation_dataloader.py", line 372, in _create_dataloader
    sampler = PerturbationBatchSampler(
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/state/.venv/lib/python3.11/site-packages/cell_load/data_modules/samplers.py", line 83, in __init__
    self.sentences = self._create_sentences()
                     ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/state/.venv/lib/python3.11/site-packages/cell_load/data_modules/samplers.py", line 257, in _create_sentences
    subset_batches = self._process_subset(global_offset, subset)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/state/.venv/lib/python3.11/site-packages/cell_load/data_modules/samplers.py", line 207, in _process_subset
    cell_codes = cache.cell_type_codes[indices]
                 ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
IndexError: arrays used as indices must be of integer (or boolean) type
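
The failure at the bottom of the traceback can be reproduced in isolation with NumPy. This is a minimal sketch, not the actual `cell_load` code: it assumes the empty index array was created without an explicit dtype (NumPy then defaults to `float64`), and `cell_type_codes`, `float_indices`, and `int_indices` are hypothetical stand-ins with made-up values.

```python
import numpy as np

# Hypothetical stand-in for cache.cell_type_codes; the values are made up.
cell_type_codes = np.array([0, 1, 1, 2])

# An empty array created without a dtype defaults to float64.
float_indices = np.array([])
print(float_indices.dtype)  # float64

# Indexing with a float array raises the IndexError seen in the traceback.
raised = False
try:
    cell_type_codes[float_indices]
except IndexError as err:
    raised = True
    print(err)

# Declaring an integer dtype up front keeps the array usable as an index,
# including after it is concatenated with real integer indices.
int_indices = np.array([], dtype=np.int64)
merged = np.concatenate([int_indices, np.array([2, 0])])
selected = cell_type_codes[merged]
print(merged.dtype, selected)
```

Concatenating a `float64` empty array with integer indices would promote the result to `float64`, which is one plausible way float indices could reach the sampler. Setting the dtype at initialization avoids that promotion.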

@abhinadduri abhinadduri merged commit a5688a8 into ArcInstitute:main Sep 17, 2025
@abhinadduri (Collaborator)

Thanks for fixing, @bzrry!
