Merge pull request #248 from jeromekelleher/last-docs-updates-and-release

Last docs updates and release
jeromekelleher authored Jun 9, 2024
2 parents 6e1ec0d + 37acdc1 commit 31a5935
Showing 2 changed files with 35 additions and 2 deletions.
35 changes: 34 additions & 1 deletion docs/vcf2zarr/tutorial.md
@@ -18,6 +18,17 @@ convert your data, basically providing different levels of
convenience and flexibility corresponding to what you might
need for small, intermediate and large datasets.

:::{warning}
The documentation of vcf2zarr is under development, and
some bits are more polished than others. This "tutorial"
is experimental, and will likely evolve into a slightly
different format in the near future. It is
a work in progress and incomplete. The
{ref}`sec-vcf2zarr-cli-ref` should be complete
and authoritative, however.
:::


## Small dataset

The simplest way to convert VCF data to Zarr is to use the
@@ -229,11 +240,33 @@ granularity). You should be careful to use this value in your scripts


Once ``dexplode-init`` is done and we know how many partitions we have,
-we need to call ``dexplode-partition`` this number of times.
+we need to call
+{ref}`dexplode-partition<cmd-vcf2zarr-dexplode-partition>` this number of times:

```{code-cell}
vcf2zarr dexplode-partition sample-dist.icf 0
vcf2zarr dexplode-partition sample-dist.icf 1
vcf2zarr dexplode-partition sample-dist.icf 2
```

This is not how it would be done in practice, of course: you would
use your cluster scheduler of choice to dispatch these operations.
:::{todo}
Document how to do this conveniently over some popular schedulers.
:::

:::{tip}
Use the ``--one-based`` argument in cases in which it's more convenient
to index the partitions from 1 to n, rather than 0 to n - 1.
:::
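
As a rough illustration of the dispatch step, the partition commands
shown above could be generated by a loop rather than typed out. This is
only a sketch: `partition_commands` is a hypothetical helper name, not
part of vcf2zarr, and it only prints the commands (indexed from 0, as in
the example above) so that they can be handed to whatever scheduler is
in use.

```shell
# Hypothetical helper (not part of vcf2zarr): print one
# dexplode-partition command per partition, indexed from 0 to n - 1.
partition_commands() {
    icf_path="$1"
    num_partitions="$2"
    i=0
    while [ "$i" -lt "$num_partitions" ]; do
        printf 'vcf2zarr dexplode-partition %s %d\n' "$icf_path" "$i"
        i=$((i + 1))
    done
}

# Print the three commands used in the example above.
partition_commands sample-dist.icf 3
```

Each printed line is an independent job, so the output could be saved
to a file and submitted as separate tasks.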

Finally we need to call
{ref}`dexplode-finalise<cmd-vcf2zarr-dexplode-finalise>`:
```{code-cell}
vcf2zarr dexplode-finalise sample-dist.icf
```
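
The ordering described here (every partition first, then finalise) can
be sketched as a small wrapper. This is an illustration under stated
assumptions, not a recommended workflow: `explode_all` and the
`VCF2ZARR` variable are hypothetical names, the partitions run one after
another rather than on a cluster, and the sketch assumes that a failed
partition should prevent the finalise step from running.

```shell
# Hypothetical wrapper (not part of vcf2zarr). Set
# VCF2ZARR="echo vcf2zarr" for a dry run that prints the commands
# instead of executing them.
VCF2ZARR="${VCF2ZARR:-vcf2zarr}"

explode_all() {
    icf_path="$1"
    num_partitions="$2"
    i=0
    while [ "$i" -lt "$num_partitions" ]; do
        # Assumption: stop at the first failing partition so that
        # finalise is never run on an incomplete set.
        $VCF2ZARR dexplode-partition "$icf_path" "$i" || return 1
        i=$((i + 1))
    done
    # Only reached once every partition command has succeeded.
    $VCF2ZARR dexplode-finalise "$icf_path"
}
```

For the example above this would be called as
`explode_all sample-dist.icf 3`.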

:::{todo}
Document the process for dencode, noting the information output about
memory requirements.
:::
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -24,7 +24,7 @@ dependencies = [
]
requires-python = ">=3.9"
classifiers = [
-"Development Status :: 3 - Alpha",
+"Development Status :: 4 - Beta",
"License :: OSI Approved :: Apache Software License",
"Operating System :: POSIX",
"Operating System :: POSIX :: Linux",