Make API definitions consistent #54

davidmezzetti · 2021-01-08T22:18:01Z

With the additional functionality added to txtai over the last few releases, the API definitions have gotten somewhat inconsistent. This issue will address that and make many of the return types across modules consistent. The changes are breaking in many cases and will require a bump of the major version of txtai to v2.

The current Python API definitions for v1 are:

Current Python API v1

embeddings.search("query text")
return [(id, score)] sort score desc
embeddings.similarity("query text", documents)
return [score]
embeddings.add(documents)
embeddings.index()
embeddings.transform("text")
return [float]
extractor(sections, queue)
return [(name, answer)]
labels("text", ["label1"])
return [(label, score)] sort score desc

The new method templates and return types are below.

New Python API v2

embeddings.search("query text")
return [(id, score)] sort score desc
embeddings.batchsearch(["query text1", "query text2"])
return [[(id, score)] sort score desc]
embeddings.add(documents)
embeddings.index()
embeddings.similarity("query text", texts)
return [(id, score)] sort score desc
embeddings.batchsimilarity(["query text1", "query text2"], texts)
return [[(id, score)] sort score desc]
embeddings.transform("text")
return [float]
embeddings.batchtransform(["text1", "text2"])
return [[float]]
extractor(queue, texts)
return [(name, answer)]
labels("text", ["label1"])
return [(id, score)] sort score desc
labels(["text1", "text2"], ["label1"])
return [[(id, score)] sort score desc]
similarity("query text", texts)
return [(id, score)] sort score desc
batchsimilarity(["query text1", "query text2"], texts)
return [[(id, score)] sort score desc]
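The most visible breaking change for callers is that `similarity` moves from a bare `[score]` list to `[(id, score)]` sorted by score descending. As a rough migration sketch, the helper below reshapes a v1-style result into the v2 shape. The helper name `to_v2_similarity` is hypothetical, and it assumes `id` is the position of each text in the input list.

```python
def to_v2_similarity(scores):
    """Convert a v1-style [score] list into v2-style [(id, score)], best first.

    Assumption: the id is the index of the text in the list passed to similarity.
    """
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# v1 returned one score per input text, in input order
v1_scores = [0.1, 0.9, 0.5]

# v2 shape: (id, score) tuples sorted by score descending
v2_results = to_v2_similarity(v1_scores)
print(v2_results)
```

A batch result follows the same idea with one such list per query.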

External v2 API Calls

The API methods also need to have corresponding changes.

Given that JSON doesn't support tuples and some languages can't easily map arrays/tuples to objects, the return types are mapped from tuples to JSON objects. For example, instead of (id, score) the API will return {"id": value, "score": value}.
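A minimal sketch of that mapping, using only the standard library. The function names `to_objects` and `to_batch_objects` are made up for illustration and are not part of txtai.

```python
import json


def to_objects(results):
    """Map [(id, score)] tuples to [{"id": ..., "score": ...}] objects."""
    return [{"id": uid, "score": score} for uid, score in results]


def to_batch_objects(batch):
    """Apply the same mapping to each per-query result list in a batch."""
    return [to_objects(results) for results in batch]


# Python-side result: tuples, sorted by score descending
results = [(2, 0.91), (0, 0.35)]

# API-side payload: a list of JSON objects with named fields
payload = json.dumps(to_objects(results))
print(payload)
```

Named fields mean a client in any language can read `id` and `score` without relying on positional conventions.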

The API also differs from the native Python API in the following ways.

extract uses the Extractor pipeline, which is a callable object in Python.
label/batchlabel uses the Labels pipeline, which is a callable object in Python that supports both string and list input.
similarity/batchsimilarity uses the Similarity pipeline, which is a callable object in Python that supports both string and list input.
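To illustrate the "callable object that supports both string and list input" pattern, here is a toy sketch. `ToySimilarity` is a hypothetical stand-in, not the txtai Similarity pipeline, and its token-overlap scoring is only a placeholder for a real model.

```python
class ToySimilarity:
    """Hypothetical callable: a string query returns one ranked list,
    a list of queries returns one ranked list per query."""

    def __call__(self, query, texts):
        if isinstance(query, str):
            return self._rank(query, texts)
        return [self._rank(q, texts) for q in query]

    def _rank(self, query, texts):
        # Placeholder scoring: Jaccard overlap of lowercase tokens
        qtokens = set(query.lower().split())
        scores = []
        for index, text in enumerate(texts):
            ttokens = set(text.lower().split())
            union = qtokens | ttokens
            score = len(qtokens & ttokens) / len(union) if union else 0.0
            scores.append((index, score))

        # [(id, score)] sorted by score descending
        return sorted(scores, key=lambda pair: pair[1], reverse=True)


similarity = ToySimilarity()
texts = ["dog ran", "the cat sat"]

single = similarity("cat sat", texts)           # [(id, score)]
batch = similarity(["cat sat", "dog"], texts)   # [[(id, score)]]
```

An external API has no equivalent of this input-type dispatch, which is one reason to expose separate single and batch methods there.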

The following list shows how the API methods will look through language binding libraries.

embeddings.search("query text")
embeddings.batchsearch(["query text1", "query text2"])
embeddings.add(documents)
embeddings.index()
embeddings.similarity("query text", texts)
embeddings.batchsimilarity(["query text1", "query text2"], texts)
embeddings.transform("text")
embeddings.batchtransform(["text1", "text2"])
extractor.extract(questions, texts)
labels.label("text", ["label1"])
labels.batchlabel(["text1", "text2"], ["label1"])
similarity.similarity("query text", texts)
similarity.batchsimilarity(["query text1", "query text2"], texts)

davidmezzetti self-assigned this Jan 8, 2021

davidmezzetti changed the title ~~Simplify and clean API definitions~~ Simplify and make API definitions consistent Jan 8, 2021

davidmezzetti added a commit that referenced this issue Jan 8, 2021

Refactor API definitions across modules (#54), add batch version of methods (#18, #53)

e84d7c0

davidmezzetti closed this as completed Jan 8, 2021

This was referenced Jan 11, 2021

Sync with txtai 2.x API neuml/txtai.js#1

Closed

Sync with txtai 2.x API neuml/txtai.rs#1

Closed

davidmezzetti changed the title ~~Simplify and make API definitions consistent~~ Make API definitions consistent Jan 12, 2021

davidmezzetti added a commit that referenced this issue Jan 12, 2021

Update of #54 to return json objects vs tuples to help simplify language binding libraries

a91defb

This was referenced Jan 12, 2021

Sync with txtai 2.x API neuml/txtai.java#1

Closed

Sync with txtai 2.x API neuml/txtai.go#1

Closed

davidmezzetti added this to the v2.0.0 milestone May 13, 2021
