Correct, there's no de-duplication inside of rustac. It's not too hard to do yourself:

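# Read the existing items and collect their ids.
old_items = await rustac.read("items.parquet")
old_item_ids = {item["id"] for item in old_items["features"]}
# new_items is assumed to be defined elsewhere, holding objects with an .id attribute.
# Append only the new items whose id is not already present, then write everything out.
await rustac.write(
    "update-items.parquet",
    old_items["features"] + [item for item in new_items if item.id not in old_item_ids],
)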

I'm not sure it belongs in rustac, because "de-duplication" is use-case specific: one user might want to de-duplicate on id, another on id+collection, another on id+version, and so on.
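
To make the "use-case specific" point concrete, here is a rough sketch of a merge helper where the caller picks the de-duplication key. None of this is rustac API: `merge_items`, `by_id`, and `by_id_and_collection` are hypothetical names, and the items are assumed to be plain dicts with an `"id"` field and an optional `"collection"` field.

```python
from typing import Any, Callable, Hashable, Iterable

Item = dict[str, Any]


def by_id(item: Item) -> Hashable:
    """Key on the item id alone."""
    return item["id"]


def by_id_and_collection(item: Item) -> Hashable:
    """Key on (collection, id), so equal ids in different collections are kept."""
    return (item.get("collection"), item["id"])


def merge_items(
    old: Iterable[Item],
    new: Iterable[Item],
    key: Callable[[Item], Hashable] = by_id,
) -> list[Item]:
    """Concatenate old and new items, dropping any item whose key was already seen.

    The first occurrence wins, so existing items take precedence over new ones.
    Unlike the snippet above, this also drops duplicates within `old` itself.
    """
    merged: list[Item] = []
    seen: set[Hashable] = set()
    for item in [*old, *new]:
        k = key(item)
        if k not in seen:
            seen.add(k)
            merged.append(item)
    return merged
```

With this, `merge_items(old, new)` de-duplicates on id, while `merge_items(old, new, key=by_id_and_collection)` keeps items that share an id but live in different collections. The merged list could then be written out the same way as in the snippet above.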

Answer selected by betolink