Processing custom sqlite file #10

Closed
choudharya3 opened this issue Sep 8, 2020 · 3 comments

@choudharya3

I want to create an index and a vector file over a custom SQLite articles database. I have created an articles.sqlite database of medical papers using paperetl, but I could not find any instructions on how to process it. Could you please provide instructions?

@davidmezzetti
Member

Here are the steps:

# Create vector file, currently builds vector file in ~/.cord19
python -m paperai.vectors <path to directory containing articles.sqlite>

# Index the data
python -m paperai.index <path to directory containing articles.sqlite>

I created issue #11 to add command line parameters that control the vector model creation process, but right now the output file path is hardcoded.
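
As a rough sketch, the two steps above could be wrapped in a small Python script. Only the two `python -m` invocations come from the steps above; the helper names (`resolve_data_dir`, `build_commands`, `run_pipeline`) are hypothetical and not part of paperai. The sketch also assumes that a path ending in `.sqlite` refers to the database file itself, in which case it falls back to the parent directory, since the commands expect a directory.

```python
import subprocess
import sys
from pathlib import Path


def resolve_data_dir(path):
    """Return the directory that should contain articles.sqlite.

    Assumption: a path ending in ".sqlite" is the database file itself,
    so its parent directory is used instead.
    """
    p = Path(path).expanduser()
    return p.parent if p.suffix == ".sqlite" else p


def build_commands(path):
    """Build the two command lines from the steps above, in order."""
    data_dir = str(resolve_data_dir(path))
    return [
        [sys.executable, "-m", "paperai.vectors", data_dir],
        [sys.executable, "-m", "paperai.index", data_dir],
    ]


def run_pipeline(path):
    """Run the vectors step, then the index step; stop at the first failure."""
    for cmd in build_commands(path):
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline` with the directory that holds articles.sqlite would run both steps in order. Keeping command construction separate from execution makes it possible to inspect the commands before running anything.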

@choudharya3
Author

Hi David,
Thank you very much for the prompt response.

  1. Vector file: It worked. I had been passing the full path to the .sqlite file (including the filename 'articles.sqlite'), which is why it was failing. With the directory path it works, and I now see a 'vectors' folder created under ~/.cord19.

  2. Index step: A 'models' folder is not created automatically under ~/.cord19. Instead, 4 relevant files were written directly there. I created a 'models' folder, moved these 4 files into it, and then it worked.
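
The manual workaround in step 2 could be scripted roughly as follows. This is only a sketch: `move_into_models` is a hypothetical helper, not part of paperai, and the file names must be supplied by the caller because the thread does not say what the 4 files are called. Files are named explicitly rather than moving everything, since ~/.cord19 also holds the 'vectors' folder.

```python
import shutil
from pathlib import Path


def move_into_models(base_dir, filenames):
    """Move the named files from base_dir into base_dir/models.

    Creates the models folder if it is missing. Names that do not exist as
    regular files in base_dir are skipped. Returns the destination paths.
    """
    base = Path(base_dir).expanduser()
    models = base / "models"
    models.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in filenames:
        src = base / name
        if src.is_file():
            dest = models / name
            shutil.move(str(src), str(dest))
            moved.append(dest)
    return moved
```

It would be called with `"~/.cord19"` and the list of index file names actually found in that folder.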

Thanks very much for your help.

@davidmezzetti
Member

Marking issue as resolved
