Add support for json-lines in bulk loader #3819

@dmsolow

Description

JSON Lines (http://jsonlines.org/) is a commonly used format for storing a large number of JSON objects in a file, one object per line. It suits large datasets better than a single JSON array of objects because a reader can process the file object by object without loading the entire thing into memory.
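To illustrate the memory argument, here is a minimal sketch of reading a JSON Lines stream one object at a time. The helper name `iter_json_lines` and the sample records are made up for illustration and are not part of Dgraph.

```python
import io
import json
from typing import Iterator, TextIO


def iter_json_lines(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-blank line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines, e.g. a trailing newline
            yield json.loads(line)


# In-memory stand-in for a large file on disk (hypothetical sample data)
sample = io.StringIO('{"name": "Alice"}\n{"name": "Bob"}\n')
names = [obj["name"] for obj in iter_json_lines(sample)]
print(names)
```

Only the current line is held in memory at any point, whereas a single top-level JSON array generally has to be parsed as a whole (or with a dedicated streaming parser).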

Popular big data processing frameworks such as Apache Spark write JSON Lines natively: df.write.json("out.json") writes one JSON Lines file per partition.

Support would probably be trivial to add to Dgraph, and it would help people integrate Dgraph into existing ETL workflows.
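Until something like this exists, a possible stopgap in an ETL pipeline is to convert JSON Lines into a single JSON array before loading. The sketch below assumes the loader accepts a top-level JSON array of objects, as the comparison above implies; `json_lines_to_array` is a hypothetical helper, not a Dgraph API.

```python
import io
import json
from typing import TextIO


def json_lines_to_array(src: TextIO, dst: TextIO) -> int:
    """Stream JSON Lines from src into dst as one JSON array; return the object count."""
    count = 0
    dst.write("[")
    for line in src:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        obj = json.loads(line)  # validates each record before it is written
        if count:
            dst.write(",")  # separator between array elements
        dst.write("\n")
        dst.write(json.dumps(obj))
        count += 1
    dst.write("\n]\n")
    return count


# In-memory stand-ins for the input and output files (hypothetical sample data)
src = io.StringIO('{"name": "Alice"}\n{"name": "Bob"}\n')
dst = io.StringIO()
written = json_lines_to_array(src, dst)
```

The conversion itself runs in constant memory, but it adds an extra pass over the data, which is part of why native support in the loader would be preferable.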

Metadata

Assignees

No one assigned

Labels

    Stale
    area/bulk-loader: Issues related to bulk loading.
    area/import-export: Issues related to data import and export.
    area/live-loader: Issues related to live loading.
    area/parsing: Issues related to the parser or lexer.
    kind/feature: Something completely new we should consider.
    popular
    priority/P2: Somehow important but would not block a release.
    status/accepted: We accept to investigate/work on it.
