Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix(bulk): Correctly append all the slices of cbuf while reduce phase #7110

Merged
merged 3 commits into from
Dec 11, 2020

Conversation

ahsanbarkati
Copy link
Contributor

@ahsanbarkati ahsanbarkati commented Dec 10, 2020

The bulk loader was not considering the last slice of cbuf in appendToList
function. In condition next >= 0 && (next < end || end == -1) the part end == -1 ensures that the last slice is considered because for the last slice, end is -1 hence next < end is not true. This fixes the issue of index not being seen in a bulk loaded data, the index was missing because index key is the last key and this bug causes the last key to get missing.


This change is Reviewable

@github-actions github-actions bot added the area/bulk-loader Issues related to bulk loading. label Dec 10, 2020
@netlify
Copy link

netlify bot commented Dec 10, 2020

✔️ Deploy preview for dgraph-docs ready!

🔨 Explore the source changes: a2d4528

🔍 Inspect the deploy logs: https://app.netlify.com/sites/dgraph-docs/deploys/5fd248d98c35150007061a0a

😎 Browse the preview: https://deploy-preview-7110--dgraph-docs.netlify.app

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r1, 1 of 1 files at r2.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @vvbalaji-dgraph)

@ahsanbarkati ahsanbarkati merged commit 004ca03 into master Dec 11, 2020
@ahsanbarkati ahsanbarkati deleted the ahsan/bulk-index branch December 11, 2020 07:52
aman-bansal pushed a commit that referenced this pull request Dec 11, 2020
…#7110)

The bulk loader was not considering the last slice of cbuf in appendToList
function. This change fixes the issue.
danielmai pushed a commit that referenced this pull request Dec 23, 2020
…#7110)

The bulk loader was not considering the last slice of cbuf in appendToList
function. This change fixes the issue.
@danielmai
Copy link
Contributor

This bug fix applies to the data keys as well, not just the index. This is the count of nodes per predicate from bulk loader for the 21-million movie data set before and after this PR. Some predicates were previously missing a single node.

$ diff -u 1.txt 2.txt
--- 1.txt	2020-12-25 10:40:19.780857171 -0800
+++ 2.txt	2020-12-25 10:39:53.564583223 -0800
@@ -12,7 +12,7 @@
       6 collection.films_in_collection
     116 collections
     321 company.films
-     49 company_role_or_service.companies_performing_this_role_or_service
+     50 company_role_or_service.companies_performing_this_role_or_service
     142 content_rating.country
       9 content_rating.minimum_accompanied_age
     121 content_rating.minimum_unaccompanied_age
@@ -31,7 +31,7 @@
       1 cut.release_region
       4 cut.runtime
       2 cut.type_of_cut
-3508685 dgraph.type
+3508686 dgraph.type
   95043 director.film
      10 distribution_medium.films_distributed_in_this_medium
     877 distributor.films_distributed
@@ -66,13 +66,13 @@
      50 http://www.w3.org/2002/07/owl#inverseOf
  240856 initial_release_date
    7447 job.films_with_this_crew_job
- 191562 language
+ 191563 language
    2157 location.featured_in_films
      24 locations
    4952 metacritic_id
   89073 music
   34751 music_contributor.film
-2116315 name
+2116316 name
   72998 netflix_id
       1 nytimes_id
     386 other_companies

@ahsanbarkati
Copy link
Contributor Author

ahsanbarkati commented Dec 25, 2020

Yes @danielmai, the issue was that the last slice of the buffer was not processed. It has nothing to do with the indices.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/bulk-loader Issues related to bulk loading.
Development

Successfully merging this pull request may close these issues.

None yet

3 participants