feat(orama): adds zip tree #430

Merged
merged 6 commits into from
Sep 3, 2023

Conversation

micheleriva
Member

@micheleriva micheleriva commented Jul 1, 2023

This PR aims to add an initial implementation of a Zip Tree to substitute AVL Trees on indexes with a lot of numeric properties. It's 100% backward compatible with the AVL Trees and exposes the same APIs.
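
For readers unfamiliar with the structure: a zip tree is a binary search tree in which every node also carries a random rank, and ranks never increase on the way down (a left child has a strictly smaller rank than its parent, a right child a smaller or equal one). The following is only a minimal sketch of recursive insertion under that invariant, not the code from this PR; `ZipNode`, `makeNode` and `insert` are hypothetical names, and the rank is passed in explicitly so the example stays deterministic.

```typescript
// Hypothetical node shape; the PR's actual types may differ.
interface ZipNode {
  key: number
  rank: number
  left: ZipNode | null
  right: ZipNode | null
}

function makeNode(key: number, rank: number): ZipNode {
  return { key, rank, left: null, right: null }
}

// Inserts x into the subtree rooted at root and returns the new subtree root.
// x moves up past every ancestor whose rank would violate the invariant.
function insert(x: ZipNode, root: ZipNode | null): ZipNode {
  if (root === null) return x

  if (x.key < root.key) {
    if (insert(x, root.left) === x) {
      if (x.rank < root.rank) {
        root.left = x
      } else {
        // x outranks (or ties with) root: x becomes the parent.
        root.left = x.right
        x.right = root
        return x
      }
    }
  } else {
    if (insert(x, root.right) === x) {
      if (x.rank <= root.rank) {
        root.right = x
      } else {
        // x strictly outranks root: x becomes the parent.
        root.right = x.left
        x.left = root
        return x
      }
    }
  }

  return root
}
```

In a real index the rank would come from a random source such as the `randomRank()` function discussed further down in this thread, rather than being passed by hand.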

@hastebrot

hastebrot commented Jul 1, 2023

Looks great!

I'm not sure whether Math.random() should be replaced by randomRank(); I have to look at it more closely. I guess we could also sample from a distribution directly instead of making multiple calls to Math.random()?

function randomRank(): number {
  let heads = 0;
  while (Math.random() < 0.5) {
    heads += 1;
  }
  return heads;
}

@micheleriva
Member Author

micheleriva commented Aug 24, 2023

@hastebrot thanks for the suggestion, I added it as follows:

function randomRank (maxAttempts = 32): number {
  let heads = 0

  for (let i = 0; i < maxAttempts; i++) {
    if (Math.random() >= 0.5) break
    heads += 1
  }

  return heads
}
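
With the cap in place, the loop returns rank k with probability 2^-(k+1) for every k below maxAttempts, and maxAttempts itself only when every flip comes up heads. A minimal sketch for checking that, assuming a hypothetical `rng` parameter (not part of the PR) so the random source can be swapped out in tests:

```typescript
// Hypothetical variant of randomRank that takes the random source as an argument.
function randomRankWith(rng: () => number, maxAttempts = 32): number {
  let heads = 0

  for (let i = 0; i < maxAttempts; i++) {
    if (rng() >= 0.5) break
    heads += 1
  }

  return heads
}

// Counts how often each rank occurs over a number of samples.
// Index k of the result holds the number of times rank k was drawn.
function rankHistogram(samples: number, rng: () => number = Math.random, maxAttempts = 32): number[] {
  const counts: number[] = new Array(maxAttempts + 1).fill(0)

  for (let i = 0; i < samples; i++) {
    counts[randomRankWith(rng, maxAttempts)] += 1
  }

  return counts
}
```

Calling `rankHistogram(100000)` should give roughly half of the samples at rank 0, a quarter at rank 1, and so on.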

@thebergamo

I was wondering whether we should add a better random algorithm, like one from stdlibjs?

@micheleriva
Member Author

It's ok, but we can't use external dependencies.

@hastebrot

hastebrot commented Aug 24, 2023

// Function to generate a random number from a geometric distribution.
function randomGeom(p, r) {
    return Math.floor(Math.log(1 - r) / Math.log(1 - p)) + 1;
}

const p = 0.5; // Probability of getting a head in a fair coin flip.
const r = Math.random(); // Generate a random number between 0 and 1.
const numberOfFlips = randomGeom(p, r);
> randomGeom(0.5, 0)
1
> randomGeom(0.5, 0.49999)
1
> randomGeom(0.5, 0.5)
2
> randomGeom(0.5, 0.9)
4
> randomGeom(0.5, 0.999)
10
> randomGeom(0.5, 0.99999)
17

So to get numberOfFlips >= 17 (rank 16 or higher in the binary tree), r has to be at least roughly 0.99998, if I'm correct. The exact threshold is:

> 1 - 1 / Math.pow(2, 17 - 1)
0.9999847412109375
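
That threshold follows from inverting the geometric distribution. A short derivation, assuming r is uniform on [0, 1) and writing N for numberOfFlips and n for the target value (the inequality flips in the second line because ln(1 - p) is negative):

```latex
\begin{aligned}
N &= \left\lfloor \frac{\ln(1-r)}{\ln(1-p)} \right\rfloor + 1 \\
N \ge n &\iff \frac{\ln(1-r)}{\ln(1-p)} \ge n-1
        \iff 1-r \le (1-p)^{\,n-1}
        \iff r \ge 1-(1-p)^{\,n-1} \\
\Pr[N \ge n] &= (1-p)^{\,n-1}, \qquad \Pr[N = n] = (1-p)^{\,n-1}\,p \\
p = \tfrac{1}{2},\ n = 17 &: \quad r \ge 1 - 2^{-16} \approx 0.9999847
\end{aligned}
```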

Update: randomRank() is randomGeom() - 1 (i.e. numberOfFlips - 1), so we have to subtract 1 from the function result. That way randomRank() returns 0 (not 1) in 50% of the cases.

function randomRank(r: number): number {
    return Math.floor(Math.log(1 - r) / Math.log(1 - 0.5));
}
> [...Array(20).keys()].map(() => randomRank(Math.random()))
[
  0, 0, 3, 0, 0, 2, 2,
  0, 0, 1, 1, 0, 0, 0,
  4, 0, 7, 0, 1, 0
]

I just wasn't sure whether Math.random() can return 1.0, because randomRank(1) is Infinity. In that case you could replace Infinity with 32 (maxAttempts).

Checked it: Math.random() returns a value with 0 <= value < 1, or as the ECMAScript standard puts it:

"Returns a Number value with positive sign, greater than or equal to 0 but less than 1"

Also, randomRank(0) returns -0, so we should either call Math.abs() on the result or add a guard such as if (r === 0) return 0.

Which gives us

function randomRank(r: number): number {
    if (r === 0) return 0;
    return Math.floor(Math.log(1 - r) / Math.log(1 - 0.5));
}
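
If the closed form is meant to behave like the capped loop, the result can also be clamped. A minimal sketch, assuming a hypothetical `maxRank` parameter that plays the role of maxAttempts; `randomRankClosedForm` is an illustrative name, not something from the PR:

```typescript
// Hypothetical closed-form rank with the same upper bound as the capped loop.
function randomRankClosedForm(r: number = Math.random(), maxRank = 32): number {
  // Math.floor(0 / Math.log(0.5)) evaluates to -0, so return a plain 0 here.
  if (r <= 0) return 0
  // Math.random() never returns 1, but a caller-supplied r might;
  // without this guard the formula below would yield Infinity.
  if (r >= 1) return maxRank

  const rank = Math.floor(Math.log(1 - r) / Math.log(0.5))
  return Math.min(rank, maxRank)
}
```

The explicit `r` argument keeps the function deterministic for tests; calling it with no arguments draws from Math.random().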

@hastebrot

@thebergamo it's a jungle of code, but I managed to find the line that implements the geometric distribution in stdlibjs. What's cool is that they even have a proof for it in the JSDoc comments.

https://github.com/stdlib-js/stdlib/blob/v0.0.96/lib/node_modules/%40stdlib/random/base/geometric/lib/geometric.js#L78

But in the end it's really just a single line of code.

@micheleriva
Member Author

I'm going to merge this PR without replacing the AVL tree for the number index. I'll dedicate a separate PR to that. Thanks @hastebrot for the great help 🙏

@micheleriva micheleriva merged commit 7a3a336 into main Sep 3, 2023
2 checks passed
@micheleriva micheleriva deleted the dsa/adds-zip-tree branch September 3, 2023 17:43