Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve and simplify operations on external documents ids #3379

Open
Kerollmops opened this issue Jan 25, 2021 · 14 comments
Open

Improve and simplify operations on external documents ids #3379

Kerollmops opened this issue Jan 25, 2021 · 14 comments
Labels
contribution accepted This issue accepts external contributions help wanted Extra attention is needed maintenance Issue about maintenance (CI, tests, refacto...) milli Related to the milli workspace

Comments

@Kerollmops
Copy link
Member

We need to remove the soft and hard external documents ids maps because it is just a lot of code to maintain that is not worth it. That is only interesting on the first additions of a small number of documents.

@Kerollmops Kerollmops added the enhancement New feature or improvement label Jan 25, 2021
@curquiza curquiza added the help wanted Extra attention is needed label Aug 19, 2021
@Kerollmops
Copy link
Member Author

Now that we added soft deletion into milli, this ExternalDocumentsIds struct is even useless. I would like to remove it and just replace it with a single fst::Map. It would be easier to understand and work with than a double soft + hard combo.

One requirement is a performance test just to make sure it doesn't impact the indexation time when adding or removing documents.

@yenwel
Copy link

yenwel commented Oct 12, 2022

I'd like to try this

@curquiza
Copy link
Member

Hello @yenwel

Thanks for your interest in this project 🔥 You are definitely more than welcome to open a PR for this!

FYI, we prefer not assigning people to our issues because sometimes people ask to be assigned and never come back, which discourages the volunteer contributors from opening a PR to fix this issue.
We will accept and merge the first PR that fixes correctly and well implements the issue following our contributing guidelines.

We are looking forward to reviewing your PR 😊

@yenwel
Copy link

yenwel commented Oct 13, 2022

Hi I'm having problem getting started following the contributing guidelines. If I try to do cargo run --release I get the error "error: a bin target must be available for cargo run"

@curquiza
Copy link
Member

curquiza commented Oct 13, 2022

Hum weird, I don't have the issue

Two possibilities

  • update your rust version: rustup update
  • maybe you have launched cargo run --release in a sub folder of the project. Run it at the root of Meilisearch

@yenwel
Copy link

yenwel commented Oct 13, 2022

I run this at the root of the project Milli. If I run in the subproject of e.g. cli this works fine.

@curquiza
Copy link
Member

Oh sorry, I realized I confused this repo with another one!
In this repo, we indeed not have any binary to create, you need to launch directly the tests with cargo test, I will update the CONTRIBUTING!

bors bot referenced this issue in meilisearch/milli Oct 13, 2022
663: Fix CONTRIBUTING.md step to make the project work r=Kerollmops a=curquiza

Following this discussion: https://github.com/meilisearch/milli/issues/76#issuecomment-1277459125

Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
@yenwel
Copy link

yenwel commented Oct 13, 2022

Oh snap I should have launched a PR myself to get some hacktoberfest points

@curquiza
Copy link
Member

Oh sorry! You are right! I wanted to be quick to avoid other confusion, but I should have asked you if you wanted to do it
However you have a lot of issues that are still unsolved in our organization if you are interesting in contributing 😄
https://github.com/search?q=org%3Ameilisearch+label%3Ahacktoberfest&state=open&type=Issues

@yenwel
Copy link

yenwel commented Oct 13, 2022

@Kerollmops do you want to completely remove the struct or replace its internal dual structure of soft and hard maps with just one map. Just deleting it's seems not trivial since it also now has a member for the softdeletedids.

@Kerollmops
Copy link
Member Author

@yenwel I would like to replace it with a single map, please.

@msvaljek
Copy link
Contributor

msvaljek commented Oct 17, 2022

@yenwel I would like to replace it with a single map, please.

Here's the replacement of soft and hard with a single map: meilisearch/milli#666

tbd: need to figure out benchmarking in https://github.com/meilisearch/milli/tree/main/benchmarks

@msvaljek
Copy link
Contributor

Sadly I won't be able to bring meilisearch/milli#666 to end due to obligations - issue is free for taking again, I'm sorry 🙇

@curquiza
Copy link
Member

curquiza commented Jan 2, 2023

No problem @msvaljek, thanks for letting us know!

For anyone wanting to fix this issue, the meilisearch/milli#666 PR is a good start, you just need to apply Kero's requests of change.

Thanks again for your time @msvaljek

@curquiza curquiza added the milli Related to the milli workspace label Jan 16, 2023
@curquiza curquiza transferred this issue from meilisearch/milli Jan 18, 2023
@curquiza curquiza added the contribution accepted This issue accepts external contributions label Jan 18, 2023
@curquiza curquiza added maintenance Issue about maintenance (CI, tests, refacto...) and removed enhancement New feature or improvement labels Jan 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution accepted This issue accepts external contributions help wanted Extra attention is needed maintenance Issue about maintenance (CI, tests, refacto...) milli Related to the milli workspace
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants