## Concatenate csv-files with similar column names

### Example

Consider the following example, which consists of two small .tsv files with the same headers:

In [1]:
# FILE 1:
!cat testdata1.tsv

Title 1	Title 2	Title 3
1	a	08/18/07
2	b	08/19/07
3	c	08/20/07


In [2]:
# FILE 2:
!cat testdata2.tsv

Title 1	Title 2	Title 3
40	D	08/21/07
50	E	08/22/07
60	F	08/23/07


We want to concatenate these two files, keeping only the headers of the first file. We could do this on the Linux command line, but that is not very convenient to handle, so it is shown here only as an illustration:

In [3]:
! cat testdata1.tsv > concatenated.tsv
! tail -n +2 testdata2.tsv >> concatenated.tsv
! cat concatenated.tsv

Title 1	Title 2	Title 3
1	a	08/18/07
2	b	08/19/07
3	c	08/20/07
40	D	08/21/07
50	E	08/22/07
60	F	08/23/07
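
The same header-skipping logic can also be sketched in plain Python, which avoids the dependence on Linux shell tools. The function name `concat_keep_first_header` is hypothetical and only serves this illustration; it works line by line, like `cat` and `tail -n +2`, and does not parse the columns:

```python
def concat_keep_first_header(paths, out_path):
    """Concatenate text files, keeping only the header line of the first file."""

    def with_newline(line):
        # Make sure a file without a trailing newline does not get glued
        # to the first line of the next file.
        return line if line.endswith("\n") else line + "\n"

    with open(out_path, "w", encoding="utf-8", newline="") as out:
        for index, path in enumerate(paths):
            with open(path, encoding="utf-8", newline="") as src:
                header = src.readline()
                # Only the first file contributes its header line.
                if index == 0 and header:
                    out.write(with_newline(header))
                # All remaining lines are data rows and are copied as-is.
                for line in src:
                    out.write(with_newline(line))
```

Calling `concat_keep_first_header(["testdata1.tsv", "testdata2.tsv"], "concatenated.tsv")` should give the same result as the two shell commands above.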


### Python implementation

The existing tool `csvcat` (https://github.com/dhellmann/csvcat) can be used for this purpose. It already provides an option to skip the headers, used as follows:

In [4]:
!csvcat --skip-headers testdata1.tsv testdata2.tsv testdata1.tsv

Title 1	Title 2	Title 3
1	a	08/18/07
2	b	08/19/07
3	c	08/20/07
40	D	08/21/07
50	E	08/22/07
60	F	08/23/07
1	a	08/18/07
2	b	08/19/07
3	c	08/20/07


To write the result to a file, we have to redirect the output to an output file:

In [5]:
!csvcat --skip-headers testdata1.tsv testdata2.tsv testdata1.tsv > concatenated_file.tsv

(Note that the `!` before the command is only needed to run a command-line command from within this notebook environment; do not add it when working directly on the command line, aka the *black screen*.)

### Manual

#### Running the tool

To use the package for our purpose, follow these steps:
* put the files together in a folder
* run the command on the command line with the appropriate file names, redirecting the output to a single combined file:
    ```
    csvcat --skip-headers filename1 filename2 filename3 filename4 filename5 > concatenated_file.tsv
    ```

This should do the job!
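
When the folder contains many files, typing every file name becomes tedious. As a rough sketch (this is not how `csvcat` is implemented), the standard-library `csv` module can collect all files in a folder and also check that the headers really are identical. The function name `concat_tsv_folder` and the `*.tsv` pattern are assumptions made for this illustration:

```python
import csv
from pathlib import Path


def concat_tsv_folder(folder, out_path, pattern="*.tsv"):
    """Concatenate all tab-separated files in `folder`, sorted by file name.

    The header of the first file is written once; a ValueError is raised
    when another file has a different header.
    """
    out_path = Path(out_path)
    # Exclude the output file itself in case it lives in the same folder.
    paths = sorted(
        p for p in Path(folder).glob(pattern)
        if p.resolve() != out_path.resolve()
    )
    expected = None
    with open(out_path, "w", encoding="utf-8", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        for path in paths:
            with open(path, encoding="utf-8", newline="") as src:
                reader = csv.reader(src, delimiter="\t")
                header = next(reader, None)
                if header is None:
                    continue  # empty file, nothing to copy
                if expected is None:
                    expected = header
                    writer.writerow(header)
                elif header != expected:
                    raise ValueError(
                        "Header of {} differs from the first file".format(path)
                    )
                # Copy the data rows, skipping blank lines.
                writer.writerows(row for row in reader if row)
    return paths
```

For example, `concat_tsv_folder("data", "concatenated_file.tsv")` would combine every `.tsv` file in a (hypothetical) folder `data`. Failing on a header mismatch is a deliberate choice here, so that files with different columns are not merged silently.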

#### Installation on Windows - current workflow

* Install Miniconda (Python 3.5): http://conda.pydata.org/miniconda.html
* Create a conda environment (or use the general/root environment) to install the `csvcat` tool:
    * Just type on the command line:
        ```
        pip install csvcat
        ```

    * The following steps are no longer needed now that PyPI is updated:
        * go to https://github.com/dhellmann/csvcat or the fork https://github.com/stijnvanhoey/csvcat
        * download the code (`git clone` or download as a zip file); remark: *at that time `pip install` could not be used, because PyPI was not updated yet*
        * from **within** the folder with the code, type on the command line:
            ```
            python setup.py install
            ```