Generating embeddings after model training #14
Hello, thank you for this great work! I am interested in generating embeddings from a GPN model I've trained. Specifically, I would like to recreate a plot similar to the UMAP in Fig. 2 of the paper. I can see from the embedding_umap notebook that two parquet files are used, one with the windows and one with the model embeddings. How can I generate the windows.parquet file for Arabidopsis or other species? Also, if I want to run this analysis on a different species or reference genome, do I need to significantly adjust the Snakefile or any other files and folders to reproduce your Arabidopsis analysis? Thank you!
Hello!
windows.parquet is produced by the rules download_annotation, expand_annotation and define_embedding_windows. It needs two input files:

I don't think you would encounter significant issues porting this to another species. Let me know!
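For anyone porting this to another species, it may help to picture what a windows table could look like. The sketch below tiles annotated intervals into fixed-length windows using only the standard library. It is not taken from the GPN Snakefile: the function name tile_windows, the column names, and the window length and step of 512 and 256 are all assumptions made for illustration, so check the define_embedding_windows rule for the actual logic.

```python
from typing import Dict, Iterable, Iterator, Tuple, Union

# (chrom, start, end, feature) with 0-based, half-open coordinates.
# This layout is a hypothetical stand-in for an expanded annotation table.
Interval = Tuple[str, int, int, str]
Window = Dict[str, Union[str, int]]


def tile_windows(
    intervals: Iterable[Interval],
    window_size: int = 512,  # assumed value, not read from the GPN workflow
    step: int = 256,         # assumed value, not read from the GPN workflow
) -> Iterator[Window]:
    """Yield fixed-length windows that lie fully inside each interval."""
    for chrom, start, end, feature in intervals:
        # Intervals shorter than window_size give an empty range and are skipped.
        for w_start in range(start, end - window_size + 1, step):
            yield {
                "chrom": chrom,
                "start": w_start,
                "end": w_start + window_size,
                "feature": feature,
            }


if __name__ == "__main__":
    annotation = [("chr1", 0, 1024, "exon"), ("chr2", 100, 300, "intron")]
    for window in tile_windows(annotation):
        print(window)
```

Each row would then correspond to one sequence window to embed, with the feature label available for colouring the UMAP.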