borges-indexer fails to run with database schema from latest borges version #48

Closed
vmarkovtsev opened this issue Apr 12, 2018 · 22 comments
Labels: bug

Comments

@vmarkovtsev
Collaborator

I run borges consumer and it writes several siva files and records to the DB successfully.

Then I run borges-indexer and get

FATA[0004] unable to get result set                      err="pq: column __repository._references does not exist"
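
The fatal line above only carries the driver's message text. As a minimal sketch, assuming a hypothetical helper named `missingColumn` (it is not part of borges-indexer or of the `pq` driver), a tool could pull the column name out of such a message and report a likely schema mismatch instead of a bare error:

```go
package main

import (
	"fmt"
	"regexp"
)

// missingColumnRe matches the "column ... does not exist" text seen in the log above.
var missingColumnRe = regexp.MustCompile(`column (\S+) does not exist`)

// missingColumn is a hypothetical helper for illustration only. It returns
// the column name mentioned in the message and whether the message matched.
func missingColumn(msg string) (string, bool) {
	m := missingColumnRe.FindStringSubmatch(msg)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	msg := "pq: column __repository._references does not exist"
	if col, ok := missingColumn(msg); ok {
		// A missing column usually points at a schema/version mismatch.
		fmt.Printf("possible schema mismatch: the database has no column %s\n", col)
	}
}
```

Matching on message text is fragile; it is used here only because the message is all the log shows.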
@vmarkovtsev
Collaborator Author

After updating core-retrieval to master, I get

INFO[0004] start processing repos                        workers=32
WARN[0004] empty repository                              repo=0162bb0b-5d2a-5a9c-62cf-5a81779e5db9
WARN[0004] empty repository                              repo=0162bb0b-5d28-7a05-3aca-46f5d0c88c1f
WARN[0004] empty repository                              repo=0162bb0b-5d2d-89ae-6355-d442830057ee
WARN[0004] empty repository                              repo=0162bb0b-5d2c-cd4a-dab7-2c92e8fa4043
WARN[0004] empty repository                              repo=0162bb0b-5d2e-9274-f90c-7063fb2ee658
INFO[0004] finished processing all repositories          failed=0 processed=5 total=5

@erizocosmico
Contributor

erizocosmico commented Apr 12, 2018 via email

@vmarkovtsev
Collaborator Author

vmarkovtsev commented Apr 12, 2018

I need to find the commit where everything works. Is that possible in theory, @erizocosmico, or has borges drifted too far out of sync with the indexer?

@vmarkovtsev
Collaborator Author

p.dbRepo.References foreign key does not work for some reason. The schema seems to be in order...

@vmarkovtsev
Collaborator Author

I looked through the code, everything looks fine but the foreign key is empty for some reason. I am really curious what the problem will be.

@ajnavarro
Contributor

ajnavarro commented Apr 13, 2018

borges versions that will work with the old schema are 0.11.x ones. You can get the borges binary from here: https://github.com/src-d/borges/releases/tag/v0.11.4

The old schema had the references in jsonb format on a column in repositories table, we didn't have foreign keys.

@vmarkovtsev
Collaborator Author

@ajnavarro I updated the core-retrieval package in borges-indexer locally and ran it, so it now uses exactly the same version as the current borges. It compiled and almost worked, as seen in the logs... Would it be hard to update borges-indexer, or could you at least point me to where to investigate? The schema is the same on both ends, so the fix should be easy.

@erizocosmico
Contributor

There shouldn't really be anything to do in borges-indexer besides updating core-retrieval to the latest version.

@vmarkovtsev
Collaborator Author

I assure you that this is what I did...

@vmarkovtsev
Collaborator Author

I can post a DB dump here if you want.

@erizocosmico
Contributor

Don't worry, I'll take a look when I pick up this issue. For the time being, use borges 0.11.x, as it's the version we used when this tool was written.

@vmarkovtsev
Collaborator Author

This means writing siva files again, but it looks like the only way now.

@vmarkovtsev
Collaborator Author

@ajnavarro @erizocosmico bump

@ajnavarro
Contributor

I may be wrong, but this is not a priority for us (@smola, @mcuadros?). You can still use the borges version that we used to fetch PGA, and then use borges-indexer.

@vmarkovtsev
Collaborator Author

We are going to present these tools to the community on May 30th and they are currently broken.

@vmarkovtsev
Collaborator Author

The issue is aligned to src-d/okrs#14

@ajnavarro
Contributor

Not at all, in my opinion. The problem here is that an outdated temporary tool, created for a specific project, is not working with the latest borges version. It's not working because we are updating and improving borges to reach that OKR.

@smola
Contributor

smola commented May 10, 2018

@vmarkovtsev Is there any problem with presenting the process for PGA generation as using a specific borges and borges-indexer version? You can even link to the exact GitHub release pages with binaries, at least for borges. We could also publish a working binary of borges-indexer here if needed.

I don't see a problem in presenting and documenting borges-indexer as what it is: a quick tool built to generate the first version of the dataset, and one that is unlikely to be part of the process for future versions of the dataset.

@smola added the bug label on May 10, 2018
@smola changed the title from "borges-indexer fails to run" to "borges-indexer fails to run with database schema from latest borges version" on May 10, 2018
@vmarkovtsev
Collaborator Author

@smola Recent borges versions include important bugfixes that make it possible to clone more repositories.
Most of the people there use Windows, and we do not provide binary releases for it. This means they have to clone the repo into the specific directory under src, fetch the specific revision that is known to work, build it, and run it. Updating borges-indexer would at least remove the step of checking out the specific revision and let them stick with a go get one-liner.

@ajnavarro
Contributor

PROPOSAL

Update borges-indexer dependencies to make it work with the new schema on borges versions >= 0.12.x.

This change will make borges-indexer fail with prior versions. In other words, with the new version it will be impossible to regenerate the index file from the current PostgreSQL PGA database, which uses the borges 0.11.x schema.

No other changes will be made to borges-indexer, such as adding new columns; it will only be made compatible with the new schema.

Caveats @smola @mcuadros ?

@smola
Contributor

smola commented May 11, 2018

@vmarkovtsev Will you need to use the up-to-date borges-indexer with our current (old) PostgreSQL PGA database?

#48 (comment)

@vmarkovtsev
Collaborator Author

@smola There is no such need.
