A JavaScript Content Addressable aRchive (CAR) file reader and writer.
See also:
- Original Go implementation
- CAR specification
- IPLD
// Create a simple .car file with a single block and that block's CID as the
// single root. Then read the .car and fetch the block again.
import fs from 'fs'
import { Readable } from 'stream'
import { CarReader, CarWriter } from '@ipld/car'
import * as raw from 'multiformats/codecs/raw'
import { CID } from 'multiformats/cid'
import { sha256 } from 'multiformats/hashes/sha2'
async function example () {
  const bytes = new TextEncoder().encode('random meaningless bytes')
  const hash = await sha256.digest(raw.encode(bytes))
  const cid = CID.create(1, raw.code, hash)
  // create the writer and set the header with a single root
  const { writer, out } = await CarWriter.create([cid])
  Readable.from(out).pipe(fs.createWriteStream('example.car'))
  // store a new block, creates a new file entry in the CAR archive
  await writer.put({ cid, bytes })
  await writer.close()
  const inStream = fs.createReadStream('example.car')
  // read and parse the entire stream in one go, this will cache the contents of
  // the car in memory so is not suitable for large files.
  const reader = await CarReader.fromIterable(inStream)
  // read the list of roots from the header
  const roots = await reader.getRoots()
  // retrieve a block, as a { cid:CID, bytes:Uint8Array } pair from the archive
  const got = await reader.get(roots[0])
  // also possible: for await (const { cid, bytes } of await CarBlockIterator.fromIterable(inStream)) { ... }
  console.log('Retrieved [%s] from example.car with CID [%s]',
    new TextDecoder().decode(got.bytes),
    roots[0].toString())
}
example().catch((err) => {
  console.error(err)
  process.exit(1)
})
Will output:
Retrieved [random meaningless bytes] from example.car with CID [bafkreihwkf6mtnjobdqrkiksr7qhp6tiiqywux64aylunbvmfhzeql2coa]
See the examples directory for more.
@ipld/car is consumed through factory methods on its different classes. Each class represents a discrete set of functionality. You should select the classes that make the most sense for your use-case.
Please be aware that @ipld/car does not validate that block data matches the paired CIDs when reading a CAR. See the verify-car.js example for one possible approach to validating blocks as they are read. Any CID verification requires that the hash function that was used to generate the CID be available; the CAR format does not restrict the allowable multihashes.
The basic CarReader class is consumed via:
import { CarReader } from '@ipld/car/reader'
Or alternatively: import { CarReader } from '@ipld/car'. CommonJS require will also work for the same import paths and references.
CarReader is useful for relatively small CAR archives as it buffers the entirety of the archive in memory to provide access to its data. This class is also suitable in a browser environment. The CarReader class provides random-access get(key) and has(key) methods as well as iterators for blocks() and cids().
CarReader can be instantiated from a single Uint8Array or from an AsyncIterable of Uint8Arrays (note that Node.js streams are AsyncIterables and can be consumed in this way).
The CarIndexedReader class is a special form of CarReader and can be consumed in Node.js only (not in the browser) via:
import { CarIndexedReader } from '@ipld/car/indexed-reader'
Or alternatively: import { CarIndexedReader } from '@ipld/car'. CommonJS require will also work for the same import paths and references.
A CarIndexedReader provides the same functionality as CarReader but is instantiated from a path to a CAR file and also adds a close() method that must be called when the reader is no longer required, to clean up resources.
CarIndexedReader performs a single full scan of a CAR file, collecting a list of CIDs and their block positions in the archive. It then performs random-access reads when blocks are requested via get() and the blocks() and cids() iterators.
This class may be suitable for random-access reads (primarily via has() and get()) of relatively large CAR files.
import { CarBlockIterator } from '@ipld/car/iterator'
// or
import { CarCIDIterator } from '@ipld/car/iterator'
Or alternatively:
import { CarBlockIterator, CarCIDIterator } from '@ipld/car'
CommonJS require will also work for the same import paths and references.
These two classes provide AsyncIterables to the blocks or just the CIDs contained within a CAR archive. These are efficient mechanisms for scanning an entire CAR archive, regardless of size, if random-access to blocks is not required.
CarBlockIterator and CarCIDIterator can be instantiated from a single Uint8Array (see CarBlockIterator.fromBytes() and CarCIDIterator.fromBytes()) or from an AsyncIterable of Uint8Arrays (see CarBlockIterator.fromIterable() and CarCIDIterator.fromIterable()). Note that Node.js streams are AsyncIterables and can be consumed in this way.
The CarIndexer class can be used to scan a CAR archive and provide indexing data on the contents. It can be consumed via:
import { CarIndexer } from '@ipld/car/indexer'
Or alternatively: import { CarIndexer } from '@ipld/car'. CommonJS require will also work for the same import paths and references.
This class is used within CarIndexedReader and is only useful in cases where an external index of a CAR needs to be generated and used.
The index data can also be used with CarReader.readRaw() to fetch block data directly from a file descriptor using the index data for that block.
A CarWriter is used to create new CAR archives. It can be consumed via:
import { CarWriter } from '@ipld/car/writer'
Or alternatively: import { CarWriter } from '@ipld/car'. CommonJS require will also work for the same import paths and references.
Creation of a CarWriter involves a "channel", or a { writer:CarWriter, out:AsyncIterable<Uint8Array> } pair. The writer side of the channel is used to put() blocks, while the out side of the channel emits the bytes that form the encoded CAR archive.
In Node.js, you can use the Readable.from() API to convert the out AsyncIterable to a standard Node.js stream, or it can be directly fed to a stream.pipeline().
class CarReader
async CarReader#getRoots()
async CarReader#has(key)
async CarReader#get(key)
async * CarReader#blocks()
async * CarReader#cids()
async CarReader.fromBytes(bytes)
async CarReader.fromIterable(asyncIterable)
async CarReader.readRaw(fd, blockIndex)
class CarIndexedReader
async CarIndexedReader#getRoots()
async CarIndexedReader#has(key)
async CarIndexedReader#get(key)
async * CarIndexedReader#blocks()
async * CarIndexedReader#cids()
async CarIndexedReader#close()
async CarIndexedReader.fromFile(path)
class CarBlockIterator
async CarBlockIterator#getRoots()
async CarBlockIterator.fromBytes(bytes)
async CarBlockIterator.fromIterable(asyncIterable)
class CarCIDIterator
async CarCIDIterator#getRoots()
async CarCIDIterator.fromBytes(bytes)
async CarCIDIterator.fromIterable(asyncIterable)
class CarIndexer
async CarIndexer#getRoots()
async CarIndexer.fromBytes(bytes)
async CarIndexer.fromIterable(asyncIterable)
class CarWriter
async CarWriter#put(block)
async CarWriter#close()
async CarWriter.create(roots)
async CarWriter.createAppender()
async CarWriter.updateRootsInBytes(bytes, roots)
async CarWriter.updateRootsInFile(fd, roots)
Properties:
version (number): The version number of the CAR referenced by this reader (should be 1).
Provides blockstore-like access to a CAR.
Implements the RootsReader interface: getRoots(). And the BlockReader interface: get(), has(), blocks() (defined as a BlockIterator) and cids() (defined as a CIDIterator).
Load this class with either import { CarReader } from '@ipld/car/reader' (const { CarReader } = require('@ipld/car/reader')). Or import { CarReader } from '@ipld/car' (const { CarReader } = require('@ipld/car')). The former will likely result in smaller bundle sizes where this is important.
- Returns: Promise<CID[]>
Get the list of roots defined by the CAR referenced by this reader. May be zero or more CIDs.
- key (CID)
- Returns: Promise<boolean>
Check whether a given CID exists within the CAR referenced by this reader.
- key (CID)
- Returns: Promise<(Block|undefined)>
Fetch a Block (a { cid:CID, bytes:Uint8Array } pair) from the CAR referenced by this reader matching the provided CID. In the case where the provided CID doesn't exist within the CAR, undefined will be returned.
- Returns: AsyncGenerator<Block>
Returns a BlockIterator (AsyncIterable<Block>) that iterates over all of the Blocks ({ cid:CID, bytes:Uint8Array } pairs) contained within the CAR referenced by this reader.
- Returns: AsyncGenerator<CID>
Returns a CIDIterator (AsyncIterable<CID>) that iterates over all of the CIDs contained within the CAR referenced by this reader.
- bytes (Uint8Array)
- Returns: Promise<CarReader>
Instantiate a CarReader from a Uint8Array blob. This performs a decode fully in memory and maintains the decoded state in memory for full access to the data via the CarReader API.
- asyncIterable (AsyncIterable<Uint8Array>)
- Returns: Promise<CarReader>
Instantiate a CarReader from an AsyncIterable<Uint8Array>, such as a modern Node.js stream. This performs a decode fully in memory and maintains the decoded state in memory for full access to the data via the CarReader API.
Care should be taken for large archives; this API may not be appropriate where memory is a concern or the archive is potentially larger than the amount of memory that the runtime can handle.
- fd (fs.promises.FileHandle|number): A file descriptor from the Node.js fs module. Either an integer, from fs.open(), or a FileHandle from fs.promises.open().
- blockIndex (BlockIndex): An index pointing to the location of the Block required. This BlockIndex should take the form: {cid:CID, blockLength:number, blockOffset:number}.
- Returns: Promise<Block>: A { cid:CID, bytes:Uint8Array } pair.
Reads a block directly from a file descriptor for an open CAR file. This function is only available in Node.js and not a browser environment.
This function can be used in connection with CarIndexer which emits the BlockIndex objects that are required by this function.
The user is responsible for opening and closing the file used in this call.
Properties:
version (number): The version number of the CAR referenced by this reader (should be 1).
A form of CarReader that pre-indexes a CAR archive from a file and provides random access to blocks within the file using the index data. This class is only available in Node.js and not a browser environment.
For large CAR files, using this form of CarReader can be significantly more efficient in terms of memory. The index consists of a list of CIDs and their location within the archive (see CarIndexer). For large numbers of blocks, this index can also occupy a significant amount of memory. In some cases it may be necessary to expand the memory capacity of a Node.js instance to allow this index to fit (e.g. by running with NODE_OPTIONS="--max-old-space-size=16384").
As a CarIndexedReader instance maintains an open file descriptor for its CAR file, an additional CarIndexedReader#close method is attached. This must be called to have full clean-up of resources after use.
Load this class with either import { CarIndexedReader } from '@ipld/car/indexed-reader' (const { CarIndexedReader } = require('@ipld/car/indexed-reader')). Or import { CarIndexedReader } from '@ipld/car' (const { CarIndexedReader } = require('@ipld/car')). The former will likely result in smaller bundle sizes where this is important.
- Returns:
Promise<CID[]>
- key (CID)
- Returns: Promise<boolean>
See CarReader#has
- key (CID)
- Returns: Promise<(Block|undefined)>
See CarReader#get
- Returns: AsyncGenerator<Block>
See CarReader#blocks
- Returns: AsyncGenerator<CID>
See CarReader#cids
- Returns: Promise<void>
Close the underlying file descriptor maintained by this CarIndexedReader. This must be called for proper resource clean-up to occur.
- path (string)
- Returns: Promise<CarIndexedReader>
Instantiate a CarIndexedReader from a file with the provided path. The CAR file is first indexed with a full pass that collects CIDs and block locations. This index is maintained in memory. Subsequent reads operate on a read-only file descriptor, fetching the block from its in-file location.
For large archives, the initial indexing may take some time. The returned Promise will resolve only after this is complete.
Properties:
version (number): The version number of the CAR referenced by this iterator (should be 1).
Provides an iterator over all of the Blocks in a CAR. Implements a BlockIterator interface, or AsyncIterable<Block>, where a Block is a { cid:CID, bytes:Uint8Array } pair.
As an implementer of AsyncIterable, this class can be used directly in a for await (const block of iterator) {} loop, where the iterator is constructed using CarBlockIterator.fromBytes or CarBlockIterator.fromIterable.
An iteration can only be performed once per instantiation.
CarBlockIterator also implements the RootsReader interface and provides the getRoots() method.
Load this class with either import { CarBlockIterator } from '@ipld/car/iterator' (const { CarBlockIterator } = require('@ipld/car/iterator')). Or import { CarBlockIterator } from '@ipld/car' (const { CarBlockIterator } = require('@ipld/car')).
- Returns: Promise<CID[]>
Get the list of roots defined by the CAR referenced by this iterator. May be zero or more CIDs.
- bytes (Uint8Array)
- Returns: Promise<CarBlockIterator>
Instantiate a CarBlockIterator from a Uint8Array blob. Rather than decoding the entire byte array prior to returning the iterator, as in CarReader.fromBytes, only the header is decoded and the remainder of the CAR is parsed as the Blocks are yielded.
- asyncIterable (AsyncIterable<Uint8Array>)
- Returns: Promise<CarBlockIterator>
Instantiate a CarBlockIterator from an AsyncIterable<Uint8Array>, such as a modern Node.js stream.
Rather than decoding the entire byte array prior to returning the iterator, as in CarReader.fromIterable, only the header is decoded and the remainder of the CAR is parsed as the Blocks are yielded.
Properties:
version (number): The version number of the CAR referenced by this iterator (should be 1).
Provides an iterator over all of the CIDs in a CAR. Implements a CIDIterator interface, or AsyncIterable<CID>. Similar to CarBlockIterator but only yields the CIDs in the CAR.
As an implementer of AsyncIterable, this class can be used directly in a for await (const cid of iterator) {} loop, where the iterator is constructed using CarCIDIterator.fromBytes or CarCIDIterator.fromIterable.
An iteration can only be performed once per instantiation.
CarCIDIterator also implements the RootsReader interface and provides the getRoots() method.
Load this class with either import { CarCIDIterator } from '@ipld/car/iterator' (const { CarCIDIterator } = require('@ipld/car/iterator')). Or import { CarCIDIterator } from '@ipld/car' (const { CarCIDIterator } = require('@ipld/car')).
- Returns: Promise<CID[]>
Get the list of roots defined by the CAR referenced by this iterator. May be zero or more CIDs.
- bytes (Uint8Array)
- Returns: Promise<CarCIDIterator>
Instantiate a CarCIDIterator from a Uint8Array blob. Rather than decoding the entire byte array prior to returning the iterator, as in CarReader.fromBytes, only the header is decoded and the remainder of the CAR is parsed as the CIDs are yielded.
- asyncIterable (AsyncIterable<Uint8Array>)
- Returns: Promise<CarCIDIterator>
Instantiate a CarCIDIterator from an AsyncIterable<Uint8Array>, such as a modern Node.js stream.
Rather than decoding the entire byte array prior to returning the iterator, as in CarReader.fromIterable, only the header is decoded and the remainder of the CAR is parsed as the CIDs are yielded.
Properties:
version (number): The version number of the CAR referenced by this indexer (should be 1).
Provides an iterator over all of the Blocks in a CAR, returning their CIDs and byte-location information. Implements an AsyncIterable<BlockIndex>, where a BlockIndex is a { cid:CID, length:number, offset:number, blockLength:number, blockOffset:number }.
As an implementer of AsyncIterable, this class can be used directly in a for await (const blockIndex of iterator) {} loop, where the iterator is constructed using CarIndexer.fromBytes or CarIndexer.fromIterable.
An iteration can only be performed once per instantiation.
CarIndexer also implements the RootsReader interface and provides the getRoots() method.
Load this class with either import { CarIndexer } from '@ipld/car/indexer' (const { CarIndexer } = require('@ipld/car/indexer')). Or import { CarIndexer } from '@ipld/car' (const { CarIndexer } = require('@ipld/car')). The former will likely result in smaller bundle sizes where this is important.
- Returns: Promise<CID[]>
Get the list of roots defined by the CAR referenced by this indexer. May be zero or more CIDs.
- bytes (Uint8Array)
- Returns: Promise<CarIndexer>
Instantiate a CarIndexer from a Uint8Array blob. Only the header is decoded initially; the remainder is processed and emitted via the iterator as it is consumed.
- asyncIterable (AsyncIterable<Uint8Array>)
- Returns: Promise<CarIndexer>
Instantiate a CarIndexer from an AsyncIterable<Uint8Array>, such as a modern Node.js stream. Only the header is decoded initially; the remainder is processed and emitted via the iterator as it is consumed.
Provides a writer interface for the creation of CAR files.
Creation of a CarWriter involves the instantiation of an input / output pair in the form of a WriterChannel, which is a { writer:CarWriter, out:AsyncIterable<Uint8Array> } pair. These two components form what can be thought of as a stream-like interface. The writer component (an instantiated CarWriter) has methods to put() new blocks and close() the writing operation (finalising the CAR archive). The out component is an AsyncIterable that yields the bytes of the archive. This can be redirected to a file or other sink. In Node.js, you can use the Readable.from() API to convert this to a standard Node.js stream, or it can be directly fed to a stream.pipeline().
The channel will provide a form of backpressure. The Promise from a put() won't resolve until the resulting data is drained from the out iterable.
It is also possible to ignore the Promise from put() calls and allow the generated data to queue in memory. This should be avoided for large CAR archives due to the memory costs and potential for memory overflow.
Load this class with either import { CarWriter } from '@ipld/car/writer' (const { CarWriter } = require('@ipld/car/writer')). Or import { CarWriter } from '@ipld/car' (const { CarWriter } = require('@ipld/car')). The former will likely result in smaller bundle sizes where this is important.
- block (Block): A { cid:CID, bytes:Uint8Array } pair.
- Returns: Promise<void>: The returned promise will only resolve once the bytes this block generates are written to the out iterable.
Write a Block (a { cid:CID, bytes:Uint8Array } pair) to the archive.
- Returns: Promise<void>
Finalise the CAR archive and signal that the out iterable should end once any remaining bytes are written.
- roots (CID[]|CID|void)
- Returns: WriterChannel: The channel takes the form of { writer:CarWriter, out:AsyncIterable<Uint8Array> }.
Create a new CAR writer "channel" which consists of a { writer:CarWriter, out:AsyncIterable<Uint8Array> } pair.
- Returns: WriterChannel: The channel takes the form of { writer:CarWriter, out:AsyncIterable<Uint8Array> }.
Create a new CAR appender "channel" which consists of a { writer:CarWriter, out:AsyncIterable<Uint8Array> } pair. This appender does not consider roots and does not produce a CAR header. It is designed to append blocks to an existing CAR archive. It is expected that out will be concatenated onto the end of an existing archive that already has a properly formatted header.
- bytes (Uint8Array)
- roots (CID[]): A new list of roots to replace the existing list in the CAR header. The new header must take up the same number of bytes as the existing header, so the roots should collectively be the same byte length as the existing roots.
- Returns: Promise<Uint8Array>
Update the list of roots in the header of an existing CAR as represented in a Uint8Array.
This operation is an overwrite, the total length of the CAR will not be modified. A rejection will occur if the new header will not be the same length as the existing header, in which case the CAR will not be modified. It is the responsibility of the user to ensure that the roots being replaced encode as the same length as the new roots.
The byte array passed in an argument will be modified and also returned upon successful modification.
- fd (fs.promises.FileHandle|number): A file descriptor from the Node.js fs module. Either an integer, from fs.open(), or a FileHandle from fs.promises.open().
- roots (CID[]): A new list of roots to replace the existing list in the CAR header. The new header must take up the same number of bytes as the existing header, so the roots should collectively be the same byte length as the existing roots.
- Returns: Promise<void>
Update the list of roots in the header of an existing CAR file. The first argument must be a file descriptor for a CAR file that is open in read and write mode (not append).
This operation is an overwrite, the total length of the CAR will not be modified. A rejection will occur if the new header will not be the same length as the existing header, in which case the CAR will not be modified. It is the responsibility of the user to ensure that the roots being replaced encode as the same length as the new roots.
This function is only available in Node.js and not a browser environment.
Licensed under either of
- Apache 2.0 (LICENSE-APACHE / http://www.apache.org/licenses/LICENSE-2.0)
- MIT (LICENSE-MIT / http://opensource.org/licenses/MIT)
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.