# Installation Guide

- Installation Guide
- Frontend
- Elasticsearch Docker Container
- Elasticsearch Server
- ETL process installation
## Frontend

To start the frontend:

- Download and install Node.js.
- Open a terminal and install Angular CLI globally: `npm install -g @angular/cli`
- Navigate into the app folder and install the packages: `npm ci`
- For development mode: run `npm run start:de` from the `openartbrowser/app` folder; the app will be available in a browser on `localhost:4200/de`. You can also use any other language supported by openartbrowser (e.g. `npm run start:en`).
- For deployment: run `npm run build-locale` on the server and copy the files to the target directory (a sketch of this step follows the list).
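A hedged sketch of that deployment step; the build output folder and the target directory are assumptions, not taken from the repository:

```sh
# Run inside the app folder on the server
npm run build-locale

# Copy the build output to the directory served by the web server.
# Both dist/ and /var/www/openartbrowser are assumptions.
sudo cp -r dist/* /var/www/openartbrowser/
```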
Frontend configuration:

- The default Elasticsearch url is `https://openartbrowser.org/api/{lang}/_search`.
- To change the Elasticsearch url to another server, change the `elasticEnvironment` variable in `app/src/environments/environment.ts` (a hedged sketch follows this list).
- If you want to use the local Elasticsearch docker container: `npm run start_docker`
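A minimal sketch of what such a change could look like, assuming a local Elasticsearch on port 9200; the property name `serverURI` and the exact shape of the object are assumptions and may differ from the real `environment.ts`:

```typescript
// Sketch only: the real elasticEnvironment object in
// app/src/environments/environment.ts may have a different shape.
export const elasticEnvironment = {
  // default: 'https://openartbrowser.org/api/{lang}/_search'
  serverURI: 'http://localhost:9200/_search', // hypothetical local endpoint
};
```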
## Elasticsearch Docker Container

The repository provides a Dockerfile for building and running a local Elasticsearch instance inside a docker container:

- Install docker.
- Build and run the image via the provided helper script: `etl/docker_elastic.sh` or `etl/docker_elastic.bat`.
- The script will kill old instances of the docker container.

More information about the Dockerfile can be found here.
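Roughly, the helper script boils down to the usual docker workflow; the image and container names below are assumptions, not taken from the repository:

```sh
# Hypothetical equivalent of etl/docker_elastic.sh
docker rm -f openartbrowser_elastic 2>/dev/null || true   # kill/remove an old instance
docker build -t openartbrowser_elastic etl/                # build the image from the provided Dockerfile
docker run -d --name openartbrowser_elastic -p 9200:9200 openartbrowser_elastic
```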
## Elasticsearch Server

To install the Elasticsearch server, pick the correct installation method for your operating system here: Installing Elasticsearch.
On the server, Elasticsearch was installed with the advanced packaging tool (apt). The currently installed version is 7.13.2.
The Elasticsearch directory layout can be viewed here: Elasticsearch directory layout.
To configure the server you need to make changes in the `elasticsearch.yml` file, which is located in `/etc/elasticsearch` (or elsewhere, depending on your OS). The `elasticsearch.yml` file can be found in the repository at `/openartbrowser/etl/upload_to_elasticsearch`. To enable the snapshot feature, which is used for index swapping, you need to provide a backup directory. On the server we use `/var/lib/elasticsearch/backup`; the location also depends on the operating system, but setting this backup directory is necessary in order to run the elasticsearch_helper.py script.
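A minimal sketch of the relevant setting; `path.repo` is the Elasticsearch option that registers directories usable as snapshot repositories:

```yaml
# /etc/elasticsearch/elasticsearch.yml (excerpt)
# Register the backup directory so the snapshot feature can use it
path.repo: ["/var/lib/elasticsearch/backup"]
```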
After all changes were written to `elasticsearch.yml`, you need to restart the Elasticsearch server, which can be done with the following commands:

Stop the Elasticsearch server:
`sudo systemctl stop elasticsearch.service`

Start the Elasticsearch server:
`sudo systemctl start elasticsearch.service`
Further information about this can be found here: Starting Elasticsearch
Elasticsearch's default configuration uses half of the system memory as heap. This is too much for our setup, so the RAM usage on both staging and production is limited. Currently the servers are configured to use 4 GB minimum and 8 GB maximum heap. This configuration is stored in `/etc/elasticsearch/jvm.options.d/jvm.options` and was done according to the advanced configuration guide for setting the JVM heap.
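With those limits, the file contains the standard JVM heap flags; a sketch of what it looks like (not a copy of the server file):

```
# /etc/elasticsearch/jvm.options.d/jvm.options (sketch)
# 4 GB minimum heap, 8 GB maximum heap
-Xms4g
-Xmx8g
```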
Upgrading the Elasticsearch clusters can be done with a full cluster restart upgrade. The installed version can be checked (on the server) with `curl -X GET "http://localhost:9200"`.
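The version number is reported in the `version.number` field of the response, e.g. (response trimmed):

```sh
curl -X GET "http://localhost:9200"
# {
#   "name" : "...",
#   "cluster_name" : "...",
#   "version" : {
#     "number" : "7.13.2",
#     ...
#   },
#   "tagline" : "You Know, for Search"
# }
```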
The Nginx server forwards every search query to the Elasticsearch server (Nginx acts as the reverse proxy for Elasticsearch). The reason for this is that the Elasticsearch server must not be accessible from the outside via its REST interface; if it were, anyone could delete indices, documents, snapshots and so on.
The configuration for this can be found in `/etc/nginx/sites-enabled/default`.
With the new multilanguage feature the endpoints have changed, but the concept stays the same.
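A minimal sketch of such a reverse proxy block, assuming Elasticsearch listens on localhost:9200; the URL path and the index name are assumptions, and the real per-language configuration differs:

```nginx
# Sketch only: expose a single search endpoint and keep the rest of the
# Elasticsearch REST interface unreachable from the outside.
location /api/en/_search {
    proxy_pass http://localhost:9200/openartbrowser_data_en/_search;
}
```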
## ETL process installation

The scripts which extract data from Wikidata for the openartbrowser are written in Python. To execute them, the following programs are required:

- Python 3, version > 3.7, available at https://www.python.org/downloads/
  - `sudo apt-get install python3`
- The Python 3 package manager pip3
  - `sudo apt-get install python3-pip`
- Node.js, installation on Ubuntu (with apt):
  - First add the Personal Package Archive (PPA) for Node.js with curl:
    - `sudo apt-get install curl`
    - `curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -`
  - Install Node.js with apt-get:
    - `sudo apt-get install nodejs`

The versions are recommendations; older versions may work.
When Python is installed, the dependencies for the openartbrowser code can be installed.
To install the dependencies, run the install_etl.sh script:
`./install_etl.sh`
or on Windows:
`./install_etl.bat`
To install all required Python packages, execute the following command in the script directory (if you run install_etl.sh this is already performed within the script):
`pip3 install -r requirements.txt`
To be able to execute the art_ontology_crawler.py script, you first have to configure the pywikibot installation. The repository provides a `/openartbrowser/etl/user-config.py` which configures the pywikibot user; this file must exist in order to run the script. The script always uses the user-config.py from the directory in which you execute art_ontology_crawler.py, so always execute it from the `/etl` directory:
`python3 data_extraction/art_ontology_crawler.py`
There are several options for configuring pywikibot: https://www.mediawiki.org/wiki/Manual:Pywikibot/user-config.py#Location
If you want to use pywikibot with a MediaWiki account, you can follow the Wikidata tutorial at the following link: https://www.wikidata.org/wiki/Wikidata:Pywikibot_-_Python_3_Tutorial/Setting_up_Shop
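A minimal sketch of what such a `user-config.py` for Wikidata can look like; the repository's file may differ, and the username line is only needed when logging in with a MediaWiki account:

```python
# Minimal pywikibot configuration targeting Wikidata (sketch).
family = 'wikidata'
mylang = 'wikidata'

# Only needed when using a MediaWiki account (placeholder username):
# usernames['wikidata']['wikidata'] = 'YourUserName'
```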
The Python scripts reference their own modules to avoid code duplication. For this to work, environment variables have to be set. On Unix-based systems, the PYTHONPATH variable has to be set to the openartbrowser/etl directory in your open shell session:
`export PYTHONPATH="${PYTHONPATH}:openartbrowser/etl"`
The Unix environment variables depend on the shell you use, so look this up if the above doesn't work for you. openartbrowser/etl/scripts/install_etl.sh contains examples of how to set the environment variable for the bash shell.
You may also set the PYWIKIBOT_DIR variable to the openartbrowser/etl directory to be able to execute the script from a directory other than openartbrowser/etl, but this is optional and not used on the server.
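If you do set it on a Unix system, it would look something like this (the path is only an example for a local checkout):

```sh
# Optional: lets pywikibot find user-config.py from any working directory
export PYWIKIBOT_DIR="$HOME/openartbrowser/etl"
```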
On Windows, the system environment variable (not the user variable) can be set via the GUI or via the terminal like this:
`setx PYTHONPATH "%PYTHONPATH%;%CD%" /M`
The PYWIKIBOT_DIR variable can also be set, but this is optional:
`setx PYWIKIBOT_DIR "%CD%" /M`
If the above doesn't work for you, please use the GUI instead.