Return ensembl ID for each embedding for emb_mode="gene" #157

Merged
mattwoodx merged 8 commits into main from geneformer/return-gene-names
Dec 17, 2024

@mattwoodx
Contributor

  • When the "gene" emb_mode is selected, we previously returned only the raw embeddings. We now return the gene embeddings as pandas Series, where the index of each Series holds the Ensembl IDs that map to the corresponding embeddings.
  • Updated the Geneformer tests to follow the structure of the scGPT tests, allowing more granular, CPU-bound unit testing.
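
To illustrate the return shape described in the first bullet, here is a minimal sketch of a pandas Series indexed by Ensembl ID. It is not the code from this PR: the helper name `to_gene_series`, the example Ensembl IDs, and the 4-dimensional embeddings are assumptions made up for illustration, and the actual layout returned by the model may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: three genes from one cell, each with a 4-dim embedding.
ensembl_ids = ["ENSG00000141510", "ENSG00000012048", "ENSG00000139618"]
raw_embeddings = np.arange(12, dtype=float).reshape(3, 4)


def to_gene_series(ids, embeddings):
    """Pair each embedding row with its Ensembl ID (illustrative helper only)."""
    if len(ids) != len(embeddings):
        raise ValueError("need exactly one Ensembl ID per embedding row")
    # Build a 1-D object array so each Series element is a whole embedding vector.
    values = np.empty(len(ids), dtype=object)
    for i, row in enumerate(embeddings):
        values[i] = row
    return pd.Series(values, index=ids)


cell_series = to_gene_series(ensembl_ids, raw_embeddings)

# Look up an embedding by Ensembl ID instead of by row position.
first_gene_embedding = cell_series["ENSG00000141510"]
```

With the IDs in the index, downstream code can select or align embeddings by gene rather than tracking positions in the raw array.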

Comment thread helical/models/geneformer/fine_tuning_model.py
Comment thread helical/models/geneformer/geneformer_utils.py
Comment thread helical/models/geneformer/geneformer_utils.py Outdated

Comment thread helical/models/classification/classifier.py
Comment thread ci/tests/test_geneformer/test_geneformer_model.py
Comment thread ci/tests/test_geneformer/test_geneformer_model.py Outdated
Comment thread ci/tests/test_geneformer/test_geneformer_model.py Outdated
Comment thread helical/models/geneformer/fine_tuning_model.py
@mattwoodx mattwoodx merged commit eede0cf into main Dec 17, 2024
@mattwoodx mattwoodx deleted the geneformer/return-gene-names branch December 17, 2024 09:50
3 participants