Producer: on adding 200 repos, only 178 are in DB #78

Closed
bzz opened this issue Jun 30, 2017 · 4 comments

@bzz
Contributor

bzz commented Jun 30, 2017

  • borges producer --source=file --file ./top200repos.txt
  • got 200 messages in Rabbit, borges.buriedQueue empty
     curl -s -u guest:guest "http://localhost:8081/api/queues/%2F/borges" | jq .messages
    
  • got 178 records in DB
     select count(1) from repositories;
    

top200repos.txt
A 200-line file: cat top200repos.txt | sort -u | uniq -d -c prints nothing, so there are no duplicates, and wc -l reports 200.

@bzz bzz added the bug label Jun 30, 2017
@bzz
Contributor Author

bzz commented Jul 1, 2017

The same happens with 2000 repos from a text file:

  • the queue has 2000 elements,
  • but the DB has only 1814
testing=# select count(*) from repositories;
 count
-------
  1814

@ajnavarro
Contributor

@bzz this is not a Borges error: we have duplicated repos in the Python and Java lists, for example github.com/mihaic/graphalytics.git

@ajnavarro ajnavarro reopened this Jul 3, 2017
@ajnavarro
Contributor

@bzz You can also check this in the top200repos.txt file: sort top200repos.txt | uniq --count.

I will close the issue, feel free to reopen if I'm wrong.
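
A minimal sketch of the check suggested above, narrowed to print only the lines that occur more than once. The sample list and its github.com/example/... entries are hypothetical stand-ins for top200repos.txt, and -c is the short form of --count:

```shell
# Hypothetical sample list standing in for top200repos.txt.
list=$(mktemp)
printf '%s\n' \
  'github.com/example/a.git' \
  'github.com/example/b.git' \
  'github.com/example/a.git' > "$list"

# uniq only compares adjacent lines, so sort first.
# uniq -c prefixes each line with its count; awk keeps counts above 1.
dupes=$(sort "$list" | uniq -c | awk '$1 > 1 { print $1, $2 }')
echo "$dupes"

# Total lines versus distinct lines: the second number is what a
# table holding one row per distinct repository would be expected to contain.
total=$(wc -l < "$list" | tr -d ' ')
unique=$(sort -u "$list" | wc -l | tr -d ' ')
echo "total=$total unique=$unique"

rm -f "$list"
```

If the distinct count matches the row count in the repositories table, the gap between queue size and DB size is explained by duplicate input lines rather than by lost messages.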

@ajnavarro ajnavarro modified the milestone: 0.1 Jul 3, 2017
@bzz
Contributor Author

bzz commented Jul 4, 2017

Thanks for catching this! I believe the confusion comes from the sort -u above, which already filters out all dupes.

sort top200repos.txt | uniq | wc -l 178
sort top200repos.txt | uniq | wc -l 1814
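
To illustrate why the original check came back empty, here is a small sketch on a hypothetical four-line list (the repo-a, repo-b, repo-c names are made up for the example). It compares uniq -d after sort -u with uniq -d after a plain sort:

```shell
# Hypothetical input with one repeated entry.
list=$(mktemp)
printf '%s\n' 'repo-a' 'repo-b' 'repo-a' 'repo-c' > "$list"

# sort -u has already dropped every repeat, so uniq -d has nothing to report.
masked=$(sort -u "$list" | uniq -d)

# A plain sort keeps the repeats and makes them adjacent, so uniq -d sees them.
found=$(sort "$list" | uniq -d)

echo "after sort -u: '${masked}'"
echo "after sort:    '${found}'"

rm -f "$list"
```

The first pipeline can never print anything, whatever the input, which is why it looked as if the list had no duplicates.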

@bzz bzz removed the bug label Jul 4, 2017