Need a custom Log Format #149

Closed
prazanna opened this issue Apr 5, 2017 · 0 comments

prazanna commented Apr 5, 2017

Today we use Avro.

Drawbacks:

  1. An Avro file has a single schema, and all objects stored in the file must be written according to that schema. The Hoodie log file schema can change.
  2. We need to publish an Avro block atomically, which today requires rolling back with a tombstone.
  3. We need custom metadata for each block, such as the delta commit time.
  4. AvroReader fails to scan an Avro file that has a partially written block (because of a failed commit). We need to skip to the next block, which means manually scanning until the next sync marker.

We need a custom format to address the pain points above. The data can still be Avro-serialized, but the file format needs to be flexible enough to store per-block metadata and sync markers.
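
To make the idea concrete, here is a minimal sketch of what such a block-framed log could look like. It is not the actual Hoodie format: the `SYNC` marker value, the JSON metadata header, the CRC32 trailer, and the names `encode_block` and `read_blocks` are all assumptions made for illustration. Each block carries its own metadata (so the schema and the delta commit time can differ per block), and the reader skips any block that is truncated or fails its checksum by rescanning for the next sync marker.

```python
import json
import struct
import zlib

# Hypothetical sync marker; the real format would pick its own magic bytes.
SYNC = b"#HLOG#"


def encode_block(metadata: dict, payload: bytes) -> bytes:
    """Frame one block as: SYNC | meta_len | payload_len | meta | payload | crc32.

    `payload` stands in for the Avro-serialized records; `metadata` can hold
    per-block fields such as the delta commit time or the writer schema.
    """
    meta = json.dumps(metadata, sort_keys=True).encode("utf-8")
    body = struct.pack(">II", len(meta), len(payload)) + meta + payload
    return SYNC + body + struct.pack(">I", zlib.crc32(body))


def _try_parse(data: bytes, head: int):
    """Parse the block body starting at `head`; return None if it is not intact."""
    if head + 8 > len(data):
        return None  # length fields themselves are truncated
    meta_len, payload_len = struct.unpack_from(">II", data, head)
    body_end = head + 8 + meta_len + payload_len
    if body_end + 4 > len(data):
        return None  # block was cut off before its checksum
    body = data[head:body_end]
    (crc,) = struct.unpack_from(">I", data, body_end)
    if zlib.crc32(body) != crc:
        return None  # partial write or corruption
    metadata = json.loads(body[8:8 + meta_len].decode("utf-8"))
    payload = body[8 + meta_len:]
    return metadata, payload, body_end + 4


def read_blocks(data: bytes):
    """Yield (metadata, payload) for every intact block, skipping damaged ones."""
    pos = 0
    while True:
        start = data.find(SYNC, pos)
        if start < 0:
            return
        block = _try_parse(data, start + len(SYNC))
        if block is None:
            pos = start + 1  # damaged block: scan forward to the next marker
            continue
        metadata, payload, end = block
        yield metadata, payload
        pos = end
```

With this layout, a block left behind by a failed commit never needs to be rewritten in place: it simply fails validation and the reader moves on. A writer could call `encode_block({"commit_time": ..., "schema": ...}, avro_bytes)` and append the result to the log file, and a reader would iterate `read_blocks(file_bytes)`.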

@prazanna prazanna self-assigned this Apr 5, 2017
@prazanna prazanna added this to 05/22/2017 in Sprint May 22, 2017
@prazanna prazanna modified the milestone: 0.3.8 Jun 15, 2017
@vinothchandar vinothchandar moved this from 05/22/2017 to Done in Sprint Jul 24, 2017
vinishjail97 pushed a commit to vinishjail97/hudi that referenced this issue Dec 15, 2023