Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Specify the new fetch documents route #234

Closed
wants to merge 12 commits into from
Closed

Conversation

irevoire
Copy link
Member

@irevoire irevoire commented May 4, 2023

🤖 API Diff


Implemented in meilisearch/meilisearch#3570
Associated issue: meilisearch/meilisearch#3477

Summary

Specify the new fetch documents route


Changes

  • The GET /indexes/:uid/documents route has been updated with a new query parameter filter
  • A new POST /indexes/:uid/documents/fetch route has been created with all the parameters of the GET version above (limit, offset, fields, and filter)
  • A new error code has been introduced in case an error happens on the new filter fields: invalid_document_filter

Misc

  • Update OpenAPI specification file
  • Update telemetry datapoints

@irevoire irevoire added Ready For Review Feature specification must be reviewed. Implemented Feature specification has been implemented. OpenAPI Update OpenAPI specification. labels May 4, 2023
@irevoire irevoire requested review from gmourier and macraig and removed request for gmourier May 4, 2023 12:47
@github-actions
Copy link

github-actions bot commented May 4, 2023

🤖 API change detected:

Added (1)

  • POST /indexes/{indexUid}/documents/fetch

Modified (1)

  • GET /indexes/{indexUid}/documents
    • Query parameter added: filter

Powered by Bump

meili-bors bot added a commit to meilisearch/meilisearch that referenced this pull request May 4, 2023
3570: Get documents by filter r=irevoire a=dureuill

# Pull Request

## Related issue

Associated spec: meilisearch/specifications#234

None really, this is more of an extension of #3477: since after this issue we'll be able to delete documents by filter, it makes sense to also be able to get documents by filter. 

## What does this PR do?

### User standpoint

- Add a new `filter` URL parameter to `GET /indexes/{:indexUid}/documents` and a new `POST /indexes/{:indexUid}/documents/fetch` route with the same `offset, limit, fields, filter` 

### Implementation standpoint

-  Add a new `Index::iter_documents` method to iterate on a set of documents rather than return a vector of these documents.
- Rewrite the other `Index::*documents` methods to use the new `Index::iter_documents` method.

## Usage

<details>
<summary>
Sample request and response
</summary>

```
curl -X POST 'http://localhost:7700/indexes/index-1101/documents/fetch' -H 'Content-Type: application/json' --data-binary '{ "filter": "genres = Comedy", "limit": 3, "offset": 8000}' | jsonxf
```

```json
{
  "results": [
    {
      "id": 326126,
      "title": "Bad Exorcists",
      "overview": "A trio of awkward teens intend to win a horror festival by making their own movie, but wind up getting their actress possessed in the process.",
      "genres": [
        "Horror",
        "Comedy"
      ],
      "poster": "https://image.tmdb.org/t/p/w500/lwd65kPbjFacAw3QSXiwSsW6cFU.jpg",
      "release_date": 1425081600
    },
    {
      "id": 326215,
      "title": "Ooops! Noah is Gone...",
      "overview": "It's the end of the world. A flood is coming. Luckily for Dave and his son Finny, a couple of clumsy Nestrians, an Ark has been built to save all animals. But as it turns out, Nestrians aren't allowed. Sneaking on board with the involuntary help of Hazel and her daughter Leah, two Grymps, they think they're safe. Until the curious kids fall off the Ark. Now Finny and Leah struggle to survive the flood and hungry predators and attempt to reach the top of a mountain, while Dave and Hazel must put aside their differences, turn the Ark around and save their kids. It's definitely not going to be smooth sailing.",
      "genres": [
        "Animation",
        "Adventure",
        "Comedy",
        "Family"
      ],
      "poster": "https://image.tmdb.org/t/p/w500/gEJXHgpiKh89Vwjc4XUY5CIgUdB.jpg",
      "release_date": 1427328000
    },
    {
      "id": 326241,
      "title": "For Here or to Go?",
      "overview": "An aspiring Indian tech entrepreneur in the Silicon Valley finds himself unexpectedly battling the bizarre American immigration system to keep his dream alive or prepare to return home forever.",
      "genres": [
        "Drama",
        "Comedy"
      ],
      "poster": "https://image.tmdb.org/t/p/w500/ff8WaA7ItBgl36kdT232i0d0Fnq.jpg",
      "release_date": 1490918400
    }
  ],
  "offset": 8000,
  "limit": 3,
  "total": 9331
}
```

<img width="1348" alt="Capture d’écran 2023-03-08 à 10 09 04" src="https://user-images.githubusercontent.com/41078892/223670905-6932b79b-f9b8-4a41-b59e-be2171705b7d.png">



</details>

# Draft status

- [ ] Route naming: having one route be `GET /indexes/{:indexUid}/documents` and the other `POST /indexes/{:indexUid}/documents/fetch` is suboptimal (also, technically a breaking change for documents with `fetch` as uid?), but `POST /indexes/{:indexUid}/documents` is already used to insert documents.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Copy link
Contributor

@macraig macraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I'll add @gmourier as reviewer in case I've missed something.

@macraig macraig requested a review from gmourier May 4, 2023 15:22
text/0124-documents-api.md Outdated Show resolved Hide resolved
text/0124-documents-api.md Show resolved Hide resolved
text/0124-documents-api.md Show resolved Hide resolved
text/0124-documents-api.md Outdated Show resolved Hide resolved
@gmourier
Copy link
Member

gmourier commented May 5, 2023

There is no telemetry added with this feature. Is this the case? (Asking for double checking, it's easy to forget from my experience 😮‍💨)

@irevoire
Copy link
Member Author

irevoire commented May 5, 2023

There is no telemetry added with this feature. Is this the case? (Asking for double checking, it's easy to forget from my experience face_exhaling)

Yes! We didn't have the time to implement it (especially since there was no telemetry initially on the GET route either).
Either we wait, and I can add it here at some point during the next week, or I'll do a second spec later.

- [3.1.2. `GET` - `indexes/:index_uid/documents/:document_id`](#312-get---indexesindexuiddocumentsdocumentid)
- [3.1.3. `POST` - `indexes/:index_uid/documents`](#313-post---indexesindexuiddocuments)
- [3.1.4. `PUT` - `indexes/:index_uid/documents`](#314-put---indexesindexuiddocuments)
- [3.1.5. `DELETE` - `indexes/:index_uid/documents`](#315-delete---indexesindexuiddocuments)
- [3.1.6. `DELETE` - `indexes/:index_uid/documents/:document_id`](#316-delete---indexesindexuiddocumentsdocumentid)
- [3.1.7. `POST` - `indexes/:index_uid/documents/delete-batch`](#317-post---indexesindexuiddocumentsdelete-batch)

#### 3.1.1. `GET` - `indexes/:index_uid/documents`
#### 3.1.1. `GET` - `indexes/:index_uid/documents` and `POST` - `indexes/:index_uid/documents/fetch`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An unsolicited comment/feedback/question: is it wise to have two endpoints that do almost exactly the same thing, but in slightly different ways?

From a documentation perspective, it makes it harder for us to provide clear guidance on recommended practices and best use cases: "you can use A to do X, but you can also use B. B is slightly better for Y, but A works fine as well." This often results in users having to spend more time than they would want to on making a decision that might not be that important.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, personally, I think we should deprecate the GET version of the route, same for the search.
To me, basically;

  • The get route can be slightly easier to use with curl
  • The post route is better on absolutely everything else (like sending big filters or using the array syntax)

Copy link
Member

@guimachiavelli guimachiavelli May 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deprecating GET would be fine for us. In case this is not possible (e.g. API stability and versioning), the second best solution would be to get clear confirmation from product and/or engine teams that we can recommend users to prefer POST and discourage GET usage.

IMHO, with this change GET becomes a net negative. Being slightly easier to use in a relatively small use-case is outweighed by the effort required in choosing between similar options.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guimachiavelli totally fine on our side to recommend users to use POST instead of GET. We can even go so far as to warn users that GET will be deprecated in the near future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brilliant, @macraig, thanks! We'll keep this in mind when documenting these changes and add you as a one of the reviewers to ensure our recommendations are aligned with the API design's intent 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth, we'd also be very happy if GET /search were to be deprecated.

Copy link
Member

@brunoocasali brunoocasali May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, just to be sure are we going to deprecate the GET index/{index_uid}/documents right away or are we waiting until the v1.3?

Cause, I can already mark those getDocuments occurrences as deprecated and point the users to the new fetchDocuments method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brunoocasali We won't mark it as deprecated for v1.2, we are waiting for v1.3

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we (integrations) follow that? Cause I don't see a world where waiting til v1.3 will make much difference for the users, a deprecation notice is not the same as "this method does not work anymore. Take this exception in your face instead".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we deprecate it, it'll continue to work until the v2 of meilisearch, right?
The user will never get an exception, I think? (but maybe you could deprecate it in the integration and get rid of the method before meilisearch v2 🤔)

text/0034-telemetry-policies.md Show resolved Hide resolved
text/0034-telemetry-policies.md Outdated Show resolved Hide resolved
text/0034-telemetry-policies.md Outdated Show resolved Hide resolved
Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com>
meili-bors bot added a commit to meilisearch/meilisearch that referenced this pull request May 16, 2023
3738: Add analytics on the get documents resource r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes #3737
Related spec meilisearch/specifications#234

## What does this PR do?
Add the analytics for the following routes:
- `GET` - `/indexes/:uid/documents`
- `GET` - `/indexes/:uid/documents/:doc_id`
- `POST` - `/indexes/:uid/documents/fetch`

These analytics are aggregated between two events:
- `Documents Fetched GET`
- `Documents Fetched POST`

That shares the same payload:
 Property name | Description | Example |
|---------------|-------------|---------|
| `requests.total_received` | Total number of request received in this batch | 325 |
| `per_document_id` | `false` | false |
| `per_filter` | `true` if `POST /indexes/:indexUid/documents/fetch` endpoint was used with a filter in this batch, otherwise `false` | false |
| `pagination.max_limit` | Highest value given for the `limit` parameter in this batch | 60 |
| `pagination.max_offset` | Highest value given for the `offset` parameter in this batch | 1000 |

Co-authored-by: Tamo <tamo@meilisearch.com>
meili-bors bot added a commit to meilisearch/meilisearch that referenced this pull request May 16, 2023
3738: Add analytics on the get documents resource r=dureuill a=irevoire

# Pull Request

## Related issue
Fixes #3737
Related spec meilisearch/specifications#234

## What does this PR do?
Add the analytics for the following routes:
- `GET` - `/indexes/:uid/documents`
- `GET` - `/indexes/:uid/documents/:doc_id`
- `POST` - `/indexes/:uid/documents/fetch`

These analytics are aggregated between two events:
- `Documents Fetched GET`
- `Documents Fetched POST`

That shares the same payload:
 Property name | Description | Example |
|---------------|-------------|---------|
| `requests.total_received` | Total number of request received in this batch | 325 |
| `per_document_id` | `false` | false |
| `per_filter` | `true` if `POST /indexes/:indexUid/documents/fetch` endpoint was used with a filter in this batch, otherwise `false` | false |
| `pagination.max_limit` | Highest value given for the `limit` parameter in this batch | 60 |
| `pagination.max_offset` | Highest value given for the `offset` parameter in this batch | 1000 |

Co-authored-by: Tamo <tamo@meilisearch.com>
Copy link
Member

@gmourier gmourier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @irevoire !

@irevoire
Copy link
Member Author

Seen it with @gmourier.
Instead of merging the get and the delete PR, we’re going to close this PR and only merge #236, which contains the specification for both features.

From what I see, all comments have been addressed on this PR. If I missed something or you want to add something, please do it on #236.

@irevoire irevoire closed this May 22, 2023
@irevoire irevoire deleted the fetch-documents branch May 22, 2023 10:01
meili-bors bot added a commit to meilisearch/meilisearch-js that referenced this pull request May 29, 2023
1493: Add the filter field in getDocuments for Meilisearch v1.2 r=bidoubiwa a=bidoubiwa

as per the specification: meilisearch/specifications#234

- Add the `filter` field on the `getDocuments` route (TS)

The `filter` field works precisely like the `filter` field used on the `search` method. See [the docs on how to use filters](https://www.meilisearch.com/docs/learn/advanced/filtering#filter-basics).

Co-authored-by: Charlotte Vermandel <charlottevermandel@gmail.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Implemented Feature specification has been implemented. OpenAPI Update OpenAPI specification. Q2:2023 Ready For Review Feature specification must be reviewed. Telemetry Update the telemetry collect. v1.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants