# Session 5a - Collate with CollateX
## Plain Text

Finally, we will start using CollateX!

In this exercise, follow the instructions here: 
- read the Markdown cells
- execute the Code cells (the ones with `In [a number]:` on their left).

If you don't remember how to execute cells in a Notebook, check the Jupyter Notebook tutorial (add link).

### Delete the outputs
In this notebook, you may already have outputs - the results of the exercises. We want to start from scratch, so let's delete the outputs!

- Go to the 'Kernel' menu
- Click on 'Restart & Clear Outputs' and confirm when Jupyter asks for it
- Wait a few seconds: a blue message saying 'Kernel ready' ('Noyau prêt') appears. If you don't see it, don't worry: it is so quick that you might have missed it, and the Notebook is ready again anyway.

## Update CollateX

We want to make sure that we are using the latest version of CollateX. You don't need to do it every time, but it is best to do it regularly - so we are running this at the beginning of the notebook:

In [None]:
!pip install --upgrade collatex
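
To see which version ended up installed, a small check like the one below may help. It is a minimal sketch that uses only the Python standard library (Python 3.8 or later); the helper name `installed_version` is made up for this example.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# 'collatex' is the package name used in the pip command above
print("collatex version:", installed_version("collatex"))
```

If the cell prints `None`, the package is not installed in the environment that this Notebook is using.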

## Importing CollateX
Before we can use CollateX, we need to import it as a module:

In [None]:
from collatex import * # the * means we import everything

Did it work? If you got an error, you may need to install the [python-levenshtein C library](https://pypi.org/project/python-Levenshtein/).

In [None]:
# install the levenshtein library
!pip install python-levenshtein
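
If you are not sure which of the two libraries is missing, you can ask Python whether it can locate them without importing anything. This is only a sketch: the helper `is_importable` is made up for this example, and `Levenshtein` is assumed to be the module name that python-levenshtein provides.

```python
from importlib.util import find_spec  # standard library


def is_importable(module_name):
    """Return True if Python can locate the module, without importing it."""
    return find_spec(module_name) is not None


# assumption: python-levenshtein installs a module called 'Levenshtein'
for name in ("collatex", "Levenshtein"):
    print(name, "->", "found" if is_importable(name) else "not found")
```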


### Installing python-levenshtein

The easiest way to get the python-levenshtein library on your computer is to install it from the Anaconda Navigator:

1. open the Anaconda Navigator and select environments
2. in the dropdown menu of the right-hand panel, select 'Not installed'
3. in the search box, search for 'levenshtein'
4. the package appears in the list! Select it by clicking the checkbox
5. click 'Apply' at the bottom of the screen

<img src="images/anaconda-python-levenshtein-steps.jpg" width="75%" style="display:block;margin:auto;">

In [None]:
# now we can import CollateX - let's start collating!
from collatex import *

## Step 1 - Create a Collation Object

We create a Collation object with this slightly cryptic line of code:

`collation = Collation()`

Here the lowercase `collation` is the variable name; you can choose any name. We simply tell CollateX to create a new, empty Collation "instance" by writing `Collation()`. Collation is a special data type that was created for CollateX.


In [None]:
collation = Collation()
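
If the idea of an "instance" is new to you, the toy class below may help. It is not CollateX's actual `Collation` class: `MiniCollation` is a made-up stand-in that only stores witnesses, to show that each call to the class creates a separate, empty object.

```python
class MiniCollation:
    """A toy stand-in for a Collation: it only stores witnesses."""

    def __init__(self):
        # every new instance starts with its own empty list
        self.witnesses = []

    def add_plain_witness(self, sigil, text):
        # store the witness as a (sigil, text) pair
        self.witnesses.append((sigil, text))


first = MiniCollation()   # one instance
second = MiniCollation()  # another, independent instance
first.add_plain_witness('A', 'Bladorthin, Dwarves, and Mr Baggins.')

print(len(first.witnesses))   # 1
print(len(second.witnesses))  # 0
```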

## Step 2 - Add Witnesses
The Collation object will contain the witnesses that we want to collate.

Each witness gets:
- a name (or sigil) to identify it. The name can be a single letter, number, or a longer name
- a text that will be collated

In [None]:
# we use the function add_plain_witness()
collation.add_plain_witness('A', 'Bladorthin, Dwarves, and Mr Baggins.')
collation.add_plain_witness('B', 'Bladorthin, dwarves and Mr Baggins, [...]')
collation.add_plain_witness('C', 'Gandalf, dwarves and Mr Baggins!')

# we can check how many witnesses are in our Collation object
len(collation.witnesses)
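
Before witnesses can be compared, their text has to be broken into smaller units (tokens). The sketch below shows one plausible way to do this with a regular expression that separates words from punctuation. The expression and the helper name `split_into_tokens` are assumptions for illustration; CollateX's own tokenizer may behave differently.

```python
import re


def split_into_tokens(text):
    """Split text into word tokens (with trailing spaces) and punctuation tokens."""
    # \w+\s*  -> a word plus any whitespace right after it
    # \W+     -> a run of punctuation and/or whitespace
    return re.findall(r"\w+\s*|\W+", text)


witnesses = {
    'A': 'Bladorthin, Dwarves, and Mr Baggins.',
    'B': 'Bladorthin, dwarves and Mr Baggins, [...]',
    'C': 'Gandalf, dwarves and Mr Baggins!',
}

for sigil, text in witnesses.items():
    print(sigil, split_into_tokens(text))
```

Joining the tokens of a witness back together gives the original text, so nothing is lost by splitting.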

## Step 3 - Collate

We give our Collation object to the function `collate()`: it will collate all the witnesses that we just added. 

We save the result into a variable.

In [None]:
result = collate(collation)
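
To get a feeling for what "collating" means, here is a rough sketch that aligns two of our witnesses with `difflib` from the Python standard library. This is a stand-in technique, not the algorithm CollateX uses: it only handles two witnesses and splits on spaces, and the helper name `align` is made up for this example.

```python
from difflib import SequenceMatcher  # standard library


def align(tokens_a, tokens_b):
    """Return a list of (tag, part_of_a, part_of_b) rows for two token lists."""
    matcher = SequenceMatcher(a=tokens_a, b=tokens_b, autojunk=False)
    rows = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        # tag is 'equal' where both witnesses agree, otherwise they differ
        rows.append((tag, tokens_a[a_start:a_end], tokens_b[b_start:b_end]))
    return rows


tokens_a = 'Bladorthin, Dwarves, and Mr Baggins.'.split()
tokens_c = 'Gandalf, dwarves and Mr Baggins!'.split()

for tag, part_a, part_c in align(tokens_a, tokens_c):
    label = 'agreement' if tag == 'equal' else 'variation'
    print(label, '|', ' '.join(part_a), '|', ' '.join(part_c))
```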

## Step 4 - Visualize the Result
Use the function `print()` -- and voilà! 

We have successfully collated our witnesses.

In [None]:
print(result)

We have just done the most basic collation possible. But the `collate()` function has more options that change how the result looks.

We will see two of these options:
- layout: either 'horizontal' (the default) or 'vertical'
- segmentation: either `True` (the default) or `False`

In [None]:
# layout changes the orientation of the table
result2 = collate(collation, layout='vertical')
print(result2)
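
The two layouts contain the same cells; only rows and columns are swapped. The sketch below illustrates this with a small hand-written table (not real CollateX output) and a made-up helper called `transpose`.

```python
# a hand-written table with one row per witness, as in the horizontal layout
horizontal = [
    ['A', 'Bladorthin', 'Dwarves', 'Baggins.'],
    ['B', 'Bladorthin', 'dwarves', 'Baggins,'],
    ['C', 'Gandalf', 'dwarves', 'Baggins!'],
]


def transpose(table):
    """Swap rows and columns of a table given as a list of lists."""
    return [list(column) for column in zip(*table)]


# one column per witness, as in the vertical layout
vertical = transpose(horizontal)
for row in vertical:
    print(' | '.join(row))
```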

In [None]:
# segmentation changes how the words are separated (or not) in the table
result3 = collate(collation, layout='vertical', segmentation=False)
print(result3)
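
One way to picture segmentation: start from a table with one token per cell, then join neighbouring columns in which the witnesses agree and disagree in the same way. The sketch below is a simplified illustration of that idea, not CollateX's implementation; the small table is hand-written and both helper functions are made up for this example.

```python
def agreement_pattern(column):
    """Describe which witnesses share a reading in one column."""
    groups = {}
    for sigil, reading in column.items():
        if reading is not None:  # None marks a gap in that witness
            groups.setdefault(reading, set()).add(sigil)
    return frozenset(frozenset(sigla) for sigla in groups.values())


def merge_columns(columns):
    """Join neighbouring columns whose agreement pattern is identical."""
    merged = []
    previous_pattern = None
    for column in columns:
        pattern = agreement_pattern(column)
        if merged and pattern == previous_pattern:
            last = merged[-1]
            for sigil, reading in column.items():
                if reading is not None:
                    last[sigil] = last[sigil] + ' ' + reading
        else:
            merged.append(dict(column))  # copy, so the input stays unchanged
        previous_pattern = pattern
    return merged


# hand-written token-level table: one dictionary per column
token_level = [
    {'A': 'Bladorthin', 'B': 'Bladorthin', 'C': 'Gandalf'},
    {'A': 'and', 'B': 'and', 'C': 'and'},
    {'A': 'Mr', 'B': 'Mr', 'C': 'Mr'},
    {'A': 'Baggins', 'B': 'Baggins', 'C': 'Baggins'},
    {'A': '.', 'B': ',', 'C': '!'},
]

for column in merge_columns(token_level):
    print(column)
```

In this toy table, the three columns where all witnesses agree ('and', 'Mr', 'Baggins') are joined into a single cell, while the columns with variation stay separate.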

## Recap and Exercise

Before moving on to see how to collate texts stored in files and to discover the various outputs that CollateX provides, let's recap what we've done and practice a bit.


**First**, create a new Markdown cell at the end of this Notebook (you could also create a new Notebook, but we'll save time by working in this one). Write something like "My CollateX test" in the new cell, so you know that everything from that cell onwards is your own test. You can use Markdown cells to document what is happening around them.

**Then**, create a Code cell and copy the code below into it: this is all CollateX needs to collate some texts, the same instructions we gave it before, but all together.

**Now** run the cell a first time and see the results.

**Make changes** and see how the output changes when you run the cell again. Change one thing at a time: this way, if you get an error message, it will be easier to debug the code. Try the following changes:
 1. Change the text for each witness
 2. Add a new witness
 3. Set the segmentation option to `True` (you will see that it is the same as deleting it)
 4. It is also possible to change the sigil for each witness. The sigil is the abbreviation used for referring to a witness, here 'A', 'B', 'C'.


In [None]:
from collatex import *
collation = Collation()
collation.add_plain_witness('A', 'Bladorthin, Dwarves, and Mr Baggins.')
collation.add_plain_witness('B', 'Bladorthin, dwarves and Mr Baggins, [...]')
collation.add_plain_witness('C', 'Gandalf, dwarves and Mr Baggins!')
result = collate(collation, layout='vertical', segmentation=False)
print(result)