Tokenize new data #81

Open
eliotwalt opened this issue Sep 6, 2022 · 3 comments
Comments

@eliotwalt

Hi, I am trying to use the model on new data and am struggling to reproduce the tokenization that produces the query_toks and query_toks_no_value fields. I tried process_sql.tokenize, but its output does not match the tokens in the dataset.
Is the code for this provided somewhere? Thanks.

@FruVirus

FruVirus commented Feb 7, 2023

I ran into the same issue. I believe that if you simply don't lowercase each word inside process_sql.tokenize, that should do the trick.
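
For illustration, here is a minimal, self-contained sketch of a case-preserving SQL tokenizer. It is not the repository's code: the name `tokenize_keep_case` and the regular expression are assumptions made for this example, and the output is not guaranteed to match the dataset's query_toks token for token.

```python
import re

# Hypothetical stand-in for a tokenizer that keeps the original casing.
# The pattern below is an assumption for illustration, not the project's rules.
_TOKEN_RE = re.compile(
    r'''
    "[^"]*"            # double-quoted string literal kept as one token
  | '[^']*'            # single-quoted string literal kept as one token
  | !=|>=|<=|<>        # two-character comparison operators
  | \d+\.\d+|\d+       # decimal or integer numbers
  | [A-Za-z_][\w.]*    # keywords and identifiers, including T1.col
  | [^\s\w]            # any other single punctuation character
    ''',
    re.VERBOSE,
)


def tokenize_keep_case(sql):
    """Split a SQL string into tokens without lowercasing anything."""
    return _TOKEN_RE.findall(sql)


tokens = tokenize_keep_case('SELECT Name FROM singer WHERE Age >= 30')
print(tokens)
```

Comparing the result against a few query_toks entries from the dataset is the quickest way to see whether the splitting rules need adjusting.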

@FruVirus

FruVirus commented Feb 7, 2023

However, I have not figured out how to get the query_toks_no_value field.

@listentomi

however, i have not figured out how to get the query_toks_no_value field

Same here. Did you ever solve this problem? Could you tell me how to generate the query_toks_no_value field?
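
One possible approach, sketched under an assumption that has not been confirmed in this thread: that query_toks_no_value is the lowercased token list with every string or numeric literal replaced by the placeholder `value`. The helper names `is_literal` and `strip_values` are hypothetical, and the result should be checked against a few dataset examples before relying on it.

```python
import re

# Assumption: a "value" is a quoted string or a plain integer/decimal number.
_NUMBER_RE = re.compile(r"\d+(\.\d+)?")


def is_literal(tok):
    """Return True if the token looks like a quoted string or a number."""
    quoted = len(tok) >= 2 and tok[0] == tok[-1] and tok[0] in "\"'"
    return quoted or _NUMBER_RE.fullmatch(tok) is not None


def strip_values(tokens, placeholder="value"):
    """Lowercase every token and replace literals with the placeholder."""
    return [placeholder if is_literal(t) else t.lower() for t in tokens]


query_toks = ['SELECT', 'Name', 'FROM', 'singer', 'WHERE',
              'Age', '>', '30', 'AND', 'Country', '=', '"France"']
print(strip_values(query_toks))
```

The function takes an already tokenized query, so it can be applied to the output of whichever tokenizer reproduces query_toks.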
