# Preparations
This binder has OpenRefine, openrefine-client, and a Jupyter server proxy preinstalled. The OpenRefine server proxy starts when the URL path `/openrefine` is opened. Doing that directly from this notebook is a bit complicated, but the following commands will do it for you:

In [None]:
notebook_url="$(jupyter notebook list | grep -o -E 'http\S+')"
openrefine_url="${notebook_url/?token/openrefine?token}"
until wget -q -O - "${openrefine_url}" | grep -q "OpenRefine" ; do sleep 1; done
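
The loop above polls forever if the server never comes up. A minimal sketch of a bounded variant follows; `wait_for` and its attempt limit are hypothetical helpers for illustration, not part of openrefine-client or Jupyter.

```shell
# wait_for MAX_ATTEMPTS COMMAND [ARGS...]
# Hypothetical helper: reruns COMMAND once per second until it succeeds,
# and gives up with exit status 1 after MAX_ATTEMPTS failed attempts.
wait_for() {
    max_attempts=$1
    shift
    attempt=0
    until "$@"; do
        attempt=$((attempt + 1))
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Stand-in commands so the sketch runs anywhere:
wait_for 3 true && echo "ready"
wait_for 1 false || echo "gave up"
```

Because `wait_for` runs a single command rather than a pipeline, the `wget ... | grep -q "OpenRefine"` check would first have to be wrapped in a small function of its own.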

For your convenience we compute the URL to the OpenRefine server (the graphical user interface is still available)...

In [None]:
openrefine_url="${openrefine_url/http:\/\/0.0.0.0:8888/https:\/\/hub.gke.mybinder.org}"
echo "$openrefine_url"

... and we compute the URL to the Jupyter dashboard (where stored files can be found)

In [None]:
notebook_url="${notebook_url/http:\/\/0.0.0.0:8888/https:\/\/hub.gke.mybinder.org}"
echo "$notebook_url"

We will store some files, so it is clearer to use a new folder for each run.

In [None]:
workspace=$(date +%Y%m%d_%H%M%S)
mkdir -p ~/$workspace && cd ~/$workspace
echo "${notebook_url/?token/tree\/${workspace}?token}"
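
The URL rewrites above all rely on bash's `${variable/pattern/replacement}` expansion. The following sketch replays them on made-up values so the effect of each step is visible; the token `abc123`, the timestamp, and the variable names `local_base`, `public_base`, `openrefine_public` and `workspace_url` are assumptions for illustration only.

```shell
# Sample values (hypothetical; a real Jupyter token is much longer)
notebook_url="http://0.0.0.0:8888/?token=abc123"
local_base="http://0.0.0.0:8888"
public_base="https://hub.gke.mybinder.org"
workspace="20190101_120000"

# ${var/pattern/replacement} replaces the first match of pattern.
# In the pattern, "?" is a wildcard for any single character; it matches
# the literal "?" here because that is the character in front of "token".
openrefine_url="${notebook_url/?token/openrefine?token}"

# Keeping the hosts in variables avoids escaping the slashes in the pattern.
openrefine_public="${openrefine_url/$local_base/$public_base}"

# Slashes in the replacement part need no escaping.
workspace_url="${notebook_url/?token/tree/${workspace}?token}"

echo "$openrefine_url"
echo "$openrefine_public"
echo "$workspace_url"
```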

# Execute openrefine-client
The openrefine-client is preinstalled in `~/.local/bin`, so you can run the program by typing `openrefine-client`.

In [None]:
openrefine-client

# Create project

Download sample data

In [None]:
wget -nv https://github.com/felixlohmeier/openrefine-kimws2019/raw/master/doaj-article-sample.csv

Import the file into OpenRefine

In [None]:
openrefine-client --create doaj-article-sample.csv

# List projects

In [None]:
openrefine-client --list

# Get metadata

In [None]:
openrefine-client --info "doaj-article-sample"

Save the project ID (we will use it later to generate a link to your OpenRefine project)

In [None]:
projectid="$(openrefine-client --info "doaj-article-sample" | head -n 1 | cut -c 5-17)"
echo "$projectid"
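
`cut -c 5-17` depends on the ID sitting at fixed character positions. A sketch of a less position-dependent alternative follows. It assumes the first line of the `--info` output looks like `id: 1234567890123`; that format is inferred from the character offsets used above, not verified against the client, and the sample value is made up.

```shell
# Hypothetical first line of "openrefine-client --info" output
sample_info="id: 1234567890123"

# Take the first run of digits on the first line, wherever it starts.
projectid="$(printf '%s\n' "$sample_info" | head -n 1 | grep -o -E '[0-9]+' | head -n 1)"
echo "$projectid"
```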

# Apply Transformations

Download a sample JSON file (its content was previously extracted via the Undo/Redo history in the OpenRefine graphical user interface)

In [None]:
wget -nv https://raw.githubusercontent.com/felixlohmeier/openrefine-kimws2019/master/doaj-openrefine.json

Apply the JSON file to the project doaj-article-sample (this takes a few seconds)

In [None]:
time openrefine-client --apply doaj-openrefine.json "doaj-article-sample"

Check results in OpenRefine

In [None]:
# Avoid "&" inside ${var/pattern/replacement}: bash 5.2+ expands it to the matched text.
project_url="${openrefine_url%%[?]*}/project?project=${projectid}&${openrefine_url#*[?]}"
echo "$project_url"

# Export data

Export in CSV format

In [None]:
time openrefine-client --export "doaj-article-sample" --output=doaj-export.csv

The first record intentionally contains the full JSON dump from the VIAF reconciliation service (this was a task in the lesson)

In [None]:
head -n 3 doaj-export.csv

# Templating Export

In [None]:
time openrefine-client --export "doaj-article-sample" \
--template="    { \"DOI\" : {{jsonize(cells[\"DOI\"].value)}}, \"Title\" : {{jsonize(cells[\"Title\"].value)}}, \"Authors\" : {{jsonize(cells[\"Authors\"].value.split(\"|\"))}} }" \
--prefix="{ \"rows\" : [
" \
--rowSeparator=",
" \
--suffix="
    ] 
}" \
> doaj-export.json

In [None]:
head -n 3 doaj-export.json
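
As far as the options above suggest, the exported file consists of the prefix, then each row rendered through the template with the row separator in between, then the suffix. A minimal sketch of that assembly in plain shell follows; the DOI values are made up, and `assemble` is a hypothetical stand-in for the exporter, not part of openrefine-client.

```shell
prefix='{ "rows" : [
'
row_separator=',
'
suffix='
    ]
}'

# Hypothetical stand-in for the exporter: joins already-rendered rows.
assemble() {
    out=$prefix
    first=1
    for row in "$@"; do
        if [ "$first" -eq 1 ]; then
            first=0
        else
            out=$out$row_separator
        fi
        out=$out$row
    done
    printf '%s%s\n' "$out" "$suffix"
}

result="$(assemble '    { "DOI" : "10.1000/a" }' '    { "DOI" : "10.1000/b" }')"
printf '%s\n' "$result"
```

The separator is only written between rows, which is why the last row carries no trailing comma and the result stays valid JSON.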

Templating export supports filter queries (e.g. only records in Spanish)

In [None]:
time openrefine-client --export "doaj-article-sample" \
--template="    { \"DOI\" : {{jsonize(cells[\"DOI\"].value)}}, \"Title\" : {{jsonize(cells[\"Title\"].value)}}, \"Authors\" : {{jsonize(cells[\"Authors\"].value.split(\"|\"))}} }" \
--prefix="{ \"rows\" : [
" \
--rowSeparator=",
" \
--suffix="
    ] 
}" \
--filterColumn=Language \
--filterQuery=ES \
> doaj-export-es.json

In [None]:
head -n 3 doaj-export-es.json

There is also an option to save each record to an individual file

In [None]:
time openrefine-client --export "doaj-article-sample" \
--template="    { \"DOI\" : {{jsonize(cells[\"DOI\"].value)}}, \"Title\" : {{jsonize(cells[\"Title\"].value)}}, \"Authors\" : {{jsonize(cells[\"Authors\"].value.split(\"|\"))}} }" \
--prefix="{ \"rows\" : [
" \
--rowSeparator=",
" \
--suffix="
    ] 
}" \
--filterColumn=Language \
--filterQuery=ES \
--splitToFiles=true \
--output=doaj-export-es.json

In [None]:
ls

Check results in Jupyter dashboard

In [None]:
echo "${notebook_url/?token/tree\/${workspace}?token}"

# Delete project

In [None]:
time openrefine-client --delete "doaj-article-sample"

# Cleanup
If something goes wrong (e.g. multiple projects with the same name), here is a shortcut to delete all existing projects in OpenRefine:

In [None]:
openrefine-client --list > projects.tmp
projectids=($(cut -c 2-14 "projects.tmp")) && rm projects.tmp
for i in "${projectids[@]}" ; do
    openrefine-client --delete "$i"
done
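
Since this loop deletes every project, it can help to preview the commands first. The sketch below does a dry run on made-up data without a temporary file. It assumes each line of the `--list` output looks like ` 1234567890123: project-name`, which is inferred from the `cut -c 2-14` offsets above rather than verified; `sample_list` and `planned` are hypothetical names.

```shell
# Hypothetical "openrefine-client --list" output
sample_list=' 1234567890123: doaj-article-sample
 2345678901234: doaj-article-sample'

# Print the delete commands instead of running them.
planned="$(printf '%s\n' "$sample_list" | cut -c 2-14 | while read -r id; do
    echo "openrefine-client --delete $id"
done)"
printf '%s\n' "$planned"
```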

# Give me more
Consult the help screen for further options

In [None]:
openrefine-client --help

The [openrefine-client](https://github.com/opencultureconsulting/openrefine-client) is available as a single-file executable for Windows, macOS and Linux. Client and server can run on different machines (host and port of the OpenRefine server can be specified, e.g. `-H 127.0.0.1 -P 80`).