# "nbdev2 - first steps"
> "by Jeremy Howard and Hamel Husain"
- show_tags: true
- toc: true
- branch: master
- badges: false
- comments: true
- categories: [nbdev, fastai, jupyter]
- image: images/icons/fastai.png

fastai has just released [nbdev2](https://nbdev.fast.ai/).

It is a complete rewrite built on Quarto. I like how the features are laid out in this card:

![](https://nbdev.fast.ai/images/card.png)

# Support

There is an [nbdev section](https://forums.fast.ai/c/nbdev/48) in the fastai forum.

There is also a #nbdev-help channel on the fastai Discord. I have never posted there.

And there is the [issues page](https://github.com/fastai/nbdev/issues) of the fastai/nbdev GitHub repo.

# Walkthrough

There is a 90-minute video: [nbdev tutorial](https://www.youtube.com/watch?v=l7zS8Ld4_iA&ab_channel=JeremyHoward) (zero to published project in 90 minutes).

I follow this tutorial here.

Here are the main steps:

## create github project

- create a new project on GitHub: [dataset_tools](https://github.com/castorfou/dataset_tools). Give it a description; nbdev will reuse it.

## integrate `nbdev` in your python environment

- create a local conda env `dataset_tools` with what is required to develop this library

In [12]:
!cat /home/guillaume/_conda_env/dataset_tools.txt

conda remove --name dataset_tools --all
conda create --name dataset_tools python=3.9
conda activate dataset_tools
conda install ipykernel
python -m ipykernel install --user --name=dataset_tools
pip install nbdev -U
pip install pandas


In [11]:
import sys
!{sys.prefix}/bin/pip list|grep nbdev

nbdev              2.2.10
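
The same check can be done from Python instead of `pip list | grep`. This is a minimal sketch using only the standard library; the helper names `installed_version` and `is_v2` are mine, not part of nbdev.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_v2(version) -> bool:
    """True when the version string has a major version of 2 or later."""
    return version is not None and int(version.split(".")[0]) >= 2


# Prints the version found in the active environment (None if nbdev is absent)
print(installed_version("nbdev"))
```

Run it with the `dataset_tools` kernel so that it inspects the right environment.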


## clone repo and turn it into a nbdev repo

- clone repo `dataset_tools` and turn it into a nbdev repo


```bash
git clone git@github.com:castorfou/dataset_tools.git
conda activate dataset_tools
cd dataset_tools
```

## nbdev_ commands are ready to be used

* nbdev can be used from here. For example, `nbdev_help` lists all `nbdev_` commands and what they do. More detail on a given command is available with `-h`, e.g. `nbdev_new -h`

In [14]:
!{sys.prefix}/bin/nbdev_help

[1m[94mnbdev_bump_version[0m              Increment version in settings.ini by one
[1m[94mnbdev_changelog[0m                 Create a CHANGELOG.md file from closed and labeled GitHub issues
[1m[94mnbdev_clean[0m                     Clean all notebooks in `fname` to avoid merge conflicts
[1m[94mnbdev_conda[0m                     Create a `meta.yaml` file ready to be built into a package, and optionally build and upload it
[1m[94mnbdev_create_config[0m             Create a config file.
[1m[94mnbdev_deploy[0m                    Deploy docs to GitHub Pages
[1m[94mnbdev_docs[0m                      Create Quarto docs and README.md
[1m[94mnbdev_export[0m                    Export notebooks in `path` to Python modules
[1m[94mnbdev_filter[0m                    A notebook filter for Quarto
[1m[94mnbdev_fix[0m                       Create working notebook from conflicted notebook `nbname`
[1m[94mnbdev_help[0m                      Show help for all console scripts

* `nbdev_new` creates the project structure and files such as `settings.ini`.
* from the base environment we can start `jupyter notebook`. It is advised to install nbextensions (`pip install jupyter_contrib_nbextensions`) and activate TOC2. Open `00_core.ipynb` with the `dataset_tools` kernel, then rename `00_core.ipynb` --> `00_container.ipynb`
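
`settings.ini` is a plain INI file, so it can be inspected with the standard library. Here is a small sketch. The excerpt is hypothetical: I am assuming a `[DEFAULT]` section with `lib_name`, `version` and `description` keys, and the values are placeholders rather than a copy of the generated file.

```python
from configparser import ConfigParser

# Hypothetical excerpt of a settings.ini; keys and values are assumptions
SAMPLE_SETTINGS = """
[DEFAULT]
lib_name = dataset_tools
version = 0.0.1
description = placeholder text, normally taken from the GitHub project
"""


def read_settings(text: str) -> dict:
    """Parse INI text and return the [DEFAULT] section as a plain dict."""
    parser = ConfigParser()
    parser.read_string(text)
    return dict(parser["DEFAULT"])


settings = read_settings(SAMPLE_SETTINGS)
print(settings["lib_name"], settings["version"])
```

To look at the real file, read its content with `open("settings.ini").read()` from the repo root and pass it to `read_settings`.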

## and `#| ` prefix in notebooks as well

Jeremy then explains the `#|` directives used by quarto and nbdev.

For example, a cell marked `#| hide` is still executed but is hidden from your documentation.

From a single notebook, you actually get 3 usages:
* the notebook itself - all cells are executed, whatever `#|` prefixes they carry
* the python file - only the cells marked `#| export` are written to the python file named by `#| default_exp <name of python file>`. The file is generated when `nbdev_export` is called.
* the documentation - all cells are used, except those starting with `#| hide`. It seems to be generated dynamically (when `nbdev_preview` is running?). `#| export` cells are handled specifically: imports are not displayed; for code, definitions and docstrings are rendered, along with the arguments.
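
To make these three usages concrete, here is a toy model in plain Python. It is only my reading of the rules listed above, not how nbdev actually processes cells: `directives` and `destinations` are hypothetical helpers, and the special rendering of `#| export` cells in the docs is ignored.

```python
def directives(cell_source: str) -> set:
    """Collect the `#|` directives found on the leading lines of a cell."""
    found = set()
    for line in cell_source.splitlines():
        if not line.startswith("#|"):
            break  # toy rule (assumption): only the leading lines are inspected
        words = line[2:].split()
        if words:
            found.add(words[0])
    return found


def destinations(cell_source: str) -> set:
    """Say where a cell ends up: notebook, module and/or docs."""
    tags = directives(cell_source)
    where = {"notebook"}        # every cell runs in the notebook
    if "export" in tags:
        where.add("module")     # written to the python file on nbdev_export
    if "hide" not in tags:
        where.add("docs")       # everything except hidden cells
    return where


cells = [
    "#| hide\nfrom nbdev.showdoc import *",
    "#| export\nclass Container: ...",
    "Container('k')",
]
for cell in cells:
    print(sorted(destinations(cell)))
```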

There is an easy way to document the arguments of a function.

Just put each argument on its own line, followed by a comment, as in

```python
import pandas as pd

class Container:
    def __init__(self,
                 cle : str, # the container key
                 dataset : pd.DataFrame = None, # the dataset
                 colonnes_a_masquer : list = None, # columns to mask (None rather than a shared mutable [])
                 colonnes_a_conserver : list = None # columns that will not be transformed
                ):
        ...  # body omitted
```
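
To get a feel for what such comments give you, here is a toy extraction of the "argument -> comment" pairs with a regular expression. As far as I know nbdev relies on fastcore's docments for this; the snippet below is only a simplified stand-in, and `make_container` and `param_comments` are hypothetical names.

```python
import re

# Hypothetical function source, written in the one-argument-per-line style
SOURCE = '''
def make_container(
    cle: str,  # the container key
    dataset=None,  # the dataset
    colonnes_a_masquer=None,  # columns to mask
):
    ...
'''

# A parameter name, anything up to the first '#', then the comment text
PARAM_COMMENT = re.compile(r"^\s*(\w+)[^#]*#\s*(.+?)\s*$")


def param_comments(source: str) -> dict:
    """Map each commented parameter name to its trailing comment."""
    docs = {}
    for line in source.splitlines():
        match = PARAM_COMMENT.match(line)
        if match:
            docs[match.group(1)] = match.group(2)
    return docs


for name, comment in param_comments(SOURCE).items():
    print(f"{name}: {comment}")
```

A regex like this breaks on anything unusual (a `#` inside a default string, several arguments on one line), which is a good reason to leave the real work to the library.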

### show_doc

We can see the effect directly by calling `show_doc` (`show_doc(Container)`). You can even call `show_doc` on code not written with nbdev, or not even written by you.

In [17]:
from nbdev.showdoc import *
import pandas as pd
show_doc(pd.DataFrame)



---

### DataFrame

>      DataFrame (data=None, index:Axes|None=None, columns:Axes|None=None,
>                 dtype:Dtype|None=None, copy:bool|None=None)

Two-dimensional, size-mutable, potentially heterogeneous tabular data.

Data structure also contains labeled axes (rows and columns).
Arithmetic operations align on both row and column labels. Can be
thought of as a dict-like container for Series objects. The primary
pandas data structure.

|    | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| data | NoneType | None | Dict can contain Series, arrays, constants, dataclass or list-like objects. If<br>data is a dict, column order follows insertion-order. If a dict contains Series<br>which have an index defined, it is aligned by its index.<br><br>.. versionchanged:: 0.25.0<br>   If data is a list of dicts, column order follows insertion-order. |
| index | Axes \| None | None | Index to use for resulting frame. Will default to RangeIndex if<br>no indexing information part of input data and no index provided. |
| columns | Axes \| None | None | Column labels to use for resulting frame when data does not have them,<br>defaulting to RangeIndex(0, 1, 2, ..., n). If data contains column labels,<br>will perform column selection instead. |
| dtype | Dtype \| None | None | Data type to force. Only a single dtype is allowed. If None, infer. |
| copy | bool \| None | None | Copy data from inputs.<br>For dict data, the default of None behaves like ``copy=True``.  For DataFrame<br>or 2d ndarray input, the default of None behaves like ``copy=False``.<br><br>.. versionchanged:: 1.3.0 |

### unit testing

# Gitlab integration


Because GitLab is the platform we use at Michelin, I will need to make nbdev work with our internal GitLab instance.

There is ongoing work to make it happen:

* from Hamel Husain - enhancement request [Support gitlab](https://github.com/fastai/nbdev/issues/945)
* and from fastai community in forum: [Nbdev and Gitlab (source links)](https://forums.fast.ai/t/nbdev-and-gitlab-source-links/98867), [Example: nbdev on Gitlab](https://forums.fast.ai/t/example-nbdev-on-gitlab/98890)