Code and models are missing #1

Open
ddofer opened this issue Sep 25, 2023 · 3 comments


@ddofer

ddofer commented Sep 25, 2023

Hi, I read the paper, but I see the repo contains no code or models.
In particular, I'd like to know whether your textual pretraining data was filtered to exclude proteins that appear in the TAPE eval set, or that are similar to any in it (by BLAST or the like), e.g. as we did in ProteinBERT: https://github.com/nadavbra/protein_bert

@chao1224
Owner

chao1224 commented Dec 1, 2023

Hi @ddofer,

Thank you for the questions.

  • We will release the code and models once our manuscript is officially published.
  • On your second question: we double-checked the SwissProtCLAP and TAPE datasets (train, eval, and test splits), and they share no protein sequences.
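
For readers who want to run a similar check themselves, here is a minimal sketch of an exact-match overlap test between two sets of protein sequences. It is not the script used for the check above, since the code has not been released; the function names and the toy sequences are hypothetical. It only detects identical sequences after normalization, so it would not catch the near-duplicates that a BLAST-style similarity search would.

```python
from typing import Iterable, Set


def normalize(seq: str) -> str:
    """Remove whitespace and uppercase, so formatting differences do not hide a match."""
    return "".join(seq.split()).upper()


def shared_sequences(pretrain: Iterable[str], evaluation: Iterable[str]) -> Set[str]:
    """Return the sequences present in both collections after normalization."""
    pretrain_set = {normalize(s) for s in pretrain}
    eval_set = {normalize(s) for s in evaluation}
    return pretrain_set & eval_set


# Toy data for illustration only, not real SwissProtCLAP or TAPE entries.
pretrain_seqs = ["MKTAYIAKQR", "GSHMLEDPVA"]
tape_seqs = ["mktayiakqr", "AAAAGGGG"]

overlap = shared_sequences(pretrain_seqs, tape_seqs)
print(f"{len(overlap)} shared sequence(s)")
```

An empty result would correspond to "no shared protein sequences" in the exact-match sense; homologous but non-identical sequences would need a separate similarity search.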

@Amelie-Schreiber

When will the paper be published and when will the code be subsequently released?

@chao1224
Owner

chao1224 commented Jan 2, 2024

Hi @Amelie-Schreiber, our manuscript is now in submission. We will release the code once it is accepted. Meanwhile, you can check the latest version here.

chao1224 added a commit that referenced this issue Mar 10, 2024