Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request for feedback: new product WRITE API with structured JSON body and new packagings structure #7564

Closed
stephanegigandet opened this issue Oct 17, 2022 · 20 comments
Labels
API WRITE WRITE API to allow sending product info and image 📦 Packaging https://wiki.openfoodfacts.org/Category:Recycling ⏰ Stale This issue hasn't seen activity in a while. You can try documenting more to unblock it. 📝 story

Comments

@stephanegigandet
Copy link
Contributor

stephanegigandet commented Oct 17, 2022

Hi Everyone! We are designing a new API so that clients can specify the different packaging components of a product. The packagings will be in a complex structure, so we are departing from the existing API that uses flat form fields to a new API that will accept directly a JSON structure as input. While we are focusing first on packagings, this new API can gradually be extended to support other product fields.

It would be great to get your feedback and suggestions on this new API, for packagings in particular, but also for updating products in general.

The API is described in OpenAPI in this PR: #7553

You can see auto-generated documentation from this OpenAPI here: https://redocly.github.io/redoc/?url=https://raw.githubusercontent.com/openfoodfacts/openfoodfacts-server/packagings-api/docs/reference/api-v3.yml#operation/post-api-v3-product-barcode

To discuss, you can use this bug, and/or #packagings and #api on Slack.

@raphael0202
Copy link
Contributor

Nice proposal! Here is my feedback:

  • I would rather avoid having userid and password in the body, as it's independent from what's requested. We could store it in headers using HTTP Basic authentication (https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication) in the absence of API tokens.
  • why adding packagings and packagings_add under a product field? For simplicity, we could directly store it at the root of the body.

@monsieurtanuki
Copy link
Contributor

@raphael0202 For the record in off-dart when we use userId / password we do that for WRITE operations (like here about saving the packaging) and we use POST syntax - all parameters will be in the body, not in the URL.
The userId is also in the header, as From: .
(Not the same when we call Robotoff).

@raphael0202 Good question about the product field: it looks like currently we don't use one and just put everything that belongs to a product directly to the root.

@stephanegigandet Generally speaking, anything that can improve the READ / WRITE is welcome! Especially with fields that works in both READ and WRITE.

I have no special interest in the _add syntax (off-dart, Smoothie).

Regarding JSON data, we could use

  • one line for each component (e.g. {material: "en:pet-polyethylene-terephthalate",recycling: "en:recycle", shape: "en:bottle"})
  • one optional meta-line for items like "user says it's complete" and "we think something is missing"

I haven't understood how the data will be translated for the user, e.g. in Smoothie. What are we supposed to do with en:recycle - will there be a way to say "Please give me the translations too" in READ? Or are we supposed to download the taxonomy?

@stephanegigandet
Copy link
Contributor Author

Thanks for the feedback @raphael0202 and @monsieurtanuki

  1. Authentication in the body vs HTTP Basic Auth

We could use HTTP Basic Auth for authentication. It would conflict with some of the things we do (like having the extra "off:off" password protection on .net), but we could change that.

I propose this:

  • for ease of code and deployment in the current setup, we stick with userid and password in the body for the immediate future (as we have a tight deadline for the packagings project, I would prefer to focus on packagings instead of spending time debugging unforeseen issues on how our nginx + Apache + mod_perl stack let us handle basic auth as we want).
  • as soon as possible, we add HTTP Basic auth has an alternate way to authenticate
  • we deprecate using userid and password in the body
  1. Regarding the product field

It is there to clearly separate the "context" of the request, versus what we want to update in the product.

image

  • userid, password, lc and cc are about the user
  • "fields" is the list of fields we want to receive back

I would like to have a similar structure for all API calls (read or write).

So I think having the "products" structure make things much cleaner. Currently we use it only for "packagings", but in the future, we will use it for all the product fields. If we put everything at the root, then we will have dozens of fields at the root, and it will be very unclear if "cc" and "lc" apply to the user or to the product. In fact today we have a "lc" field for the product as well (it contains the main language of the product).

  1. How to state if we think the list of packaging components is likely complete, or likely missing something

I would prefer to use a separate field for that (e.g. packagings_status or something), and keep the packagings array of packaging component consistent, without special lines that would need to be interpreted differently.

@alexgarel
Copy link
Member

alexgarel commented Oct 18, 2022

I strongly agree with @raphael0202 that we should put user / password in headers. It's meta information on the request, so it should not be in the request body.

Why adding packagings and packagings_add under a product field? For simplicity, we could directly store it at the root of the body.

I think it's to be consistent with the general API where you use the fields parameter to tell which fields you'd like to get. Also it's better because you clearly separate product data from eventual metadata on the request/response itself (like a status, status_verbose)

@stephanegigandet
Copy link
Contributor Author

I haven't understood how the data will be translated for the user, e.g. in Smoothie. What are we supposed to do with en:recycle - will there be a way to say "Please give me the translations too" in READ? Or are we supposed to download the taxonomy?

@monsieurtanuki : very good point. I think a way to say "Please give me the translations too" would be very useful indeed.

We can think of something like what we have for tags today: https://world.openfoodfacts.org/product/3017620422003/nutella-ferrero?fields=categories_tags,categories_tags_fr

Or we could just add properties like shape_lc and material_lc, and their value would be dependent on the "lc" field passed in the query.

@alexgarel
Copy link
Member

we don't necessarily have to use basic-auth for the authentication headers, we can use a specific header.

@stephanegigandet
Copy link
Contributor Author

we don't necessarily have to use basic-auth for the authentication headers, we can use a specific header.

It may be simpler for clients if we stick with something "basic" instead of having our own header. HTTP Basic Auth is good I think, it's just that I'd prefer to do it a bit later. :)

@raphael0202
Copy link
Contributor

It may be simpler for clients if we stick with something "basic" instead of having our own header. HTTP Basic Auth is good I think, it's just that I'd prefer to do it a bit later. :)

As we're on a tight schedule it makes sense 👍

Or we could just add properties like shape_lc and material_lc, and their value would be dependent on the "lc" field passed in the query.

We could have a single product->lc field to request translations in requested language for all taxonomized fields, it would be simpler.

@stephanegigandet
Copy link
Contributor Author

We could have a single product->lc field to request translations in requested language for all taxonomized fields, it would be simpler.

I like the idea of being able to request a specific language for all taxonomized fields.

product->lc is already used to store the main language of the product, so I'd prefer not to change it, but we could have a tags_lc field for instance.

The simplest would be to use the same field name for the translated fields and the untranslated fields.

e.g. if we have internally categories_tags: ["en:coffee", "fr:café très spécial"]

The following queries would return:

  • fields=categories_tags -> ["en:coffee", "fr:café très spécial"] (no tags_lc, we return the tags ids)

  • fields=categories_tags&tags_lc=en -> ["coffee", "fr:café très spécial"] (we don't have an English translation for the untaxonomized entry "fr:café très spécial", so we return it as-is)

  • fields=categories_tags&tags_lc=fr -> ["café", "café très spécial"]

This would work well for both reading and writing.

The only downside is that you can't get both the tag id and a translation in the same request, as we would need fields with different names.

@raphael0202
Copy link
Contributor

The only downside is that you can't get both the tag id and a translation in the same request, as we would need fields with different names.

or we can return for tags_lc=fr:

[
  {"tag": "en:coffee", "lc_name": "café"},
  {"tag": "fr:café très spécial", "lc_name": "café très spécial"},
]

and for tags_lc=en:

[
  {"tag": "en:coffee", "lc_name": "coffee"},
  {"tag": "fr:café très spécial", "lc_name": null},
]

@stephanegigandet
Copy link
Contributor Author

[
  {"tag": "en:coffee", "lc_name": "café"},
  {"tag": "fr:café très spécial", "lc_name": "café très spécial"},
]

We can do something like that indeed. So for packaging components we would have something like:

{
  "tags_lc": "fr",
  "packagings":
  [
    {
       "number": 1,
       "shape": {"id": "en:bottle", "lc_name": "bouteille"},
       "material": {"id": "en:glass", "lc_name": "verre"}
    }
  ]
}

That would be for reading, and for writing, we could accept either the "id" or the "lc_name":

{
  "tags_lc": "fr",
  "packagings":
  [
    {
       "number": 1,
       "shape": {"id": "en:bottle"},
       "material": {"lc_name": "verre"}
    }
  ]
}

@stephanegigandet
Copy link
Contributor Author

stephanegigandet commented Oct 18, 2022

I have updated the PR to include a new response status with a list of warnings and errors, that we will be able to reuse for all read and write requests.

https://redocly.github.io/redoc/?url=https://raw.githubusercontent.com/openfoodfacts/openfoodfacts-server/packagings-api/docs/reference/api-v3.yml#operation/post-api-v3-product-barcode

@harryblam
Copy link

harryblam commented Oct 19, 2022

hi @stephanegigandet ,

Thanks for the updates and the integration of the warnings / errors. It looks great from my side, just a couple of final comments - in case useful:

1) confirmation that v2 -> v3 api clients may break
The status property in the v3 api responses is now an enum string. I think status was a number in the v2 right? Just raising for visibility as it may be a non-obvious breaking change for those who update from the v2 to v3 api.

2) a small note on terminology.
As I understand:
a) The number property on a packaging object relates to how many of the given packaging there is.
b) the quantity property relates to the how much the piece of packaging can hold.

My suggestions:
For me, using number for a) is not immediately understandable (also is number a reserved word in some programming languages)? I think that quantity would be a better word to use for a).

And for b) we can use a different word related to the capacity, something like: capacity, size, volume, contents_amount.

@stephanegigandet
Copy link
Contributor Author

Hi @harryblam , thanks a lot for your feedback!

1) re: confirmation that v2 -> v3 api clients may break

It depends on the API requests, for updating products in v2, status is a number, either 0 or 1. And for uploading images, status was a string, either "status ok" or "status not ok".

For API V3, I would like to standardize the requests and responses so that they share the same structure, for both read and write requests, for searching, updating products, uploading images.

Maybe we could at least rename the new status field, for instance to status_code, so that it's more obvious that there is a change.

re: 2) a small note on terminology.

a) The number property on a packaging object relates to how many of the given packaging there is.
b) the quantity property relates to the how much the piece of packaging can hold.

That's correct.

I would prefer not to rename "number" to "quantity" as we currently use the "quantity" field in the schema for product for the contained quantity (either weight or volume).

For products, quantity is something like "20 cl", "500g", "4x25g", "600g (6x100 g)" (it's in free text), that's why it seemed logicial to have the same field for each packaging component. But I understand it could be mistaken for the amount of packaging components.

Maybe "amount" or "units" (or even something longer like "number_of_units" ?) instead of "number"?
And "quantity_per_unit" instead of "quantity"?

If we have longer names, it's probably more immediately understandable.

What do you think?

@harryblam
Copy link

@stephanegigandet no worries! I think there is a lot of context i'm missing from existing OFF apis / services so feel free to ignore if it's opening up too much complexity.

1) re: confirmation that v2 -> v3 api clients may break

Maybe we could at least rename the new status field, for instance to status_code, so that it's more obvious that there is a change.

status_code seems too close to HTTP status code in my opinion so might leave room for confusion. How about status_id?

If not, I think the original status does still work. If people are upgrading their api clients from v2 to v3 then they would at least test it first to ensure it works! Perhaps I'm over-complicating things here 😅

re: 2) a small note on terminology.
+1 for number_of_units instead of number and +1 for quantity_per_unit instead of quantity.

It's more readable, and leaves less room for mis-interpretation from a partner app perspective 👍

@stephanegigandet
Copy link
Contributor Author

status_code seems too close to HTTP status code in my opinion so might leave room for confusion. How about status_id?

status_id sounds good, thank you :)

I'll rename the fields in the proposal.

I feel we're getting to something quite good, so I will soon start the implementation of the API, and of course we can still change things as we go along.

@teolemon teolemon added 📦 Packaging https://wiki.openfoodfacts.org/Category:Recycling API WRITE WRITE API to allow sending product info and image labels Oct 31, 2022
@teolemon
Copy link
Member

teolemon commented Nov 6, 2022

Does the server handles saving input precision (1g, 0,1g, 0,01g) ?

@stephanegigandet
Copy link
Contributor Author

Does the server handles saving input precision (1g, 0,1g, 0,01g) ?

This was discussed during the packagings weekly call, some notes:

  • We save weight as a number, so we will save precision except when the last number is 0.
  • It's very likely that sub-gram precision is in fact insignificant digits (variation between packages, drops of water, different values from the stated 0.01g precision of scales)
  • Nevertheless, some people might want to store very precise numbers. We will store the numbers as entered by users, but in most cases, we will only display rounded values (grams).

Copy link
Contributor

This issue has been open 90 days with no activity. Can you give it a little love by linking it to a parent issue, adding relevant labels and projets, creating a mockup if applicable, adding code pointers from https://github.com/openfoodfacts/openfoodfacts-server/blob/main/.github/labeler.yml, giving it a priority, editing the original issue to have a more comprehensive description… Thank you very much for your contribution to 🍊 Open Food Facts

@github-actions github-actions bot added the ⏰ Stale This issue hasn't seen activity in a while. You can try documenting more to unblock it. label Dec 27, 2023
@stephanegigandet
Copy link
Contributor Author

This has been implemented in API v3: https://openfoodfacts.github.io/openfoodfacts-server/api/ref-v3/
Closing this ticket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API WRITE WRITE API to allow sending product info and image 📦 Packaging https://wiki.openfoodfacts.org/Category:Recycling ⏰ Stale This issue hasn't seen activity in a while. You can try documenting more to unblock it. 📝 story
Projects
None yet
Development

No branches or pull requests

6 participants