Commit

Merge branch 'devel'
maystery committed Feb 18, 2021
2 parents 659228d + b744045 commit 7c3f439
Showing 8 changed files with 47 additions and 121 deletions.
4 changes: 2 additions & 2 deletions sphinx/source/tutorial-bigdata-ai.rst
Original file line number Diff line number Diff line change
@@ -6,7 +6,7 @@ Big Data and AI applications
Apache Hadoop cluster
~~~~~~~~~~~~~~~~~~~~~

- This tutorial sets up a complete Apache Hadoop (version **2.10.1**) infrastructure. It contains a Hadoop Master node and Hadoop Slave worker nodes, which can be scaled up or down. Consul is used to register the Hadoop Slave nodes.
+ This tutorial sets up a complete Apache Hadoop (version **3.3.0**) infrastructure. It contains a Hadoop Master node and Hadoop Slave worker nodes, which can be scaled up or down. Consul is used to register the Hadoop Slave nodes.

**Features**

@@ -114,7 +114,7 @@ You can download the example as `tutorial.examples.hadoop-cluster <https://raw.g
#. You can check the health and statistics of the cluster through the following web pages:

- - Health of nodes: ``http://[HadoopMasterIP]:50070``
+ - Health of nodes: ``http://[HadoopMasterIP]:9870``
- Job statistics: ``http://[HadoopMasterIP]:8088``

#. To launch a Hadoop MapReduce job copy your input and executable files to the Hadoop Master node, and perform the submission described `here <https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html>`_.
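The map/shuffle/reduce flow of such a job can be illustrated with a purely local shell pipeline (a conceptual sketch on made-up sample input, not a Hadoop command):

```shell
# map:     split the input into one word per line (each word becomes a key)
# shuffle: sort brings identical keys next to each other
# reduce:  uniq -c counts each group of identical keys
printf 'apple banana apple\nbanana apple\n' | tr -s ' ' '\n' | sort | uniq -c
```

On the cluster the same three stages run distributed across the Hadoop Slave nodes, with HDFS holding the input and output files.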
25 changes: 4 additions & 21 deletions sphinx/source/tutorial-building-clusters.rst
Original file line number Diff line number Diff line change
@@ -267,14 +267,13 @@ DataAvenue cluster

Data Avenue is a data storage management service that enables access to different types of storage resources (including S3, SFTP, GridFTP, iRODS and SRM servers) through a uniform interface. The provided REST API allows performing all the typical storage operations, such as creating folders/buckets, renaming or deleting files/folders, uploading/downloading files, or copying/moving files/folders between different storage resources, even simply using 'curl' from the command line. Data Avenue automatically translates users' REST commands to the appropriate storage protocols and manages long-running data transfers in the background.

- In this tutorial we establish a cluster with two node types. On the DataAvenue node the DataAvenue application will run, and on a predefined number of storage nodes an S3 storage will run, so that DataAvenue file transfer features such as creating buckets and downloading or copying files can be tried out. We used Ceph and Docker components to build up the cluster.
+ In this tutorial we establish a cluster with two node types. On the DataAvenue node the DataAvenue application will run, and an S3 storage will run, so that DataAvenue file transfer features such as creating buckets and downloading or copying files can be tried out. We used MinIO and Docker components to build up the cluster.

**Features**

- creating two types of nodes through contextualisation
- using the nova resource handler
- using parameters to scale up storage nodes


**Prerequisites**

- accessing an Occopus compatible interface
@@ -304,29 +303,14 @@ The following steps are suggested to be performed:
=========== ============= ====================
TCP 22 SSH
TCP 80 HTTP
TCP 443 HTTPS
TCP 8080 DA service
=========== ============= ====================

#. Make sure your authentication information is set correctly in your authentication file. You must set your authentication data for the ``resource`` you would like to use. Setting authentication information is described :ref:`here <authentication>`.

#. Update the number of storage nodes if necessary. For this, edit the ``infra-dataavenue.yaml`` file and modify the min and max parameters under the scaling keyword. Scaling is the interval in which the number of nodes can change (min, max). Currently, the minimum is set to 2 (which will be the initial number at startup).

.. code:: yaml
- &S
name: storage
type: storage_node
scaling:
min: 2
.. important::

Keep in mind that Occopus has to start at least one node from each node type to work properly, and scaling can be applied only to the storage nodes in this example!


#. Optionally edit the "variables" section of the ``infra-dataavenue.yaml`` file. Set the following attributes:

- ``storage_user_name`` is the name of the S3 storage user
- ``access_key`` is the access key of the S3 storage user
- ``secret_key`` is the secret key of the S3 storage user

@@ -356,15 +340,14 @@ The following steps are suggested to be performed:
192.168.xxx.xxx (34b07a23-a26a-4a42-a5f4-73966b8ed23f)
storage:
192.168.xxx.xxx (29b98290-c6f4-4ae7-95ca-b91a9baf2ea8)
192.168.xxx.xxx (3ba43b6e-bcec-46ed-bd90-6a352749db5d)
db0f0047-f7e6-428e-a10d-3b8f7dbdb4d4
#. On the S3 storage nodes a user with predefined parameters will be created. The ``access_key`` will be the Username and the ``secret_key`` will be the Password, which are predefined in the ``infra-dataavenue.yaml`` file. Save the user credentials into a file named ``credentials`` using the command below:

.. code:: bash
- echo -e 'X-Key: 1a7e159a-ffd8-49c8-8b40-549870c70e73\nX-Username: A8Q2WPCWAELW61RWDGO8\nX-Password: FWd1mccBfnw6VHa2vod98NEQktRCYlCronxbO1aQ' > credentials
+ echo -e 'X-Key: dataavenue-key\nX-Username: A8Q2WPCWAELW61RWDGO8\nX-Password: FWd1mccBfnw6VHa2vod98NEQktRCYlCronxbO1aQ' > credentials
.. note::
This step will be useful to shorten the curl commands later when using DataAvenue!
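One way to exploit the ``credentials`` file is to expand its lines into ``-H`` options, so every later DataAvenue call stays short (a sketch; the DataAvenue URL and the ``x-uri`` header value below are placeholders, not taken from this commit):

```shell
# Recreate the credentials file from the step above:
echo -e 'X-Key: dataavenue-key\nX-Username: A8Q2WPCWAELW61RWDGO8\nX-Password: FWd1mccBfnw6VHa2vod98NEQktRCYlCronxbO1aQ' > credentials

# Turn every 'Header: value' line into a curl -H option:
HDRS=()
while IFS= read -r line; do HDRS+=(-H "$line"); done < credentials

# Placeholder call (needs a running DataAvenue node, hence commented out):
# curl "${HDRS[@]}" -H 'x-uri: s3://[StorageIP]/mybucket/' \
#     "http://[DataAvenueIP]:8080/dataavenue/rest/directory"

echo "prepared $(( ${#HDRS[@]} / 2 )) curl header options"
```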
Binary file modified tutorials/dataavenue-cluster.tar.gz
Empty file.
5 changes: 2 additions & 3 deletions tutorials/dataavenue-cluster/infra-dataavenue.yaml
Original file line number Diff line number Diff line change
@@ -9,10 +9,9 @@ nodes:
name: storage
type: storage_node
scaling:
- min: 2
- max: 10
+ min: 1
+ max: 1
variables:
storage_user_name: testuser
access_key: A8Q2WPCWAELW61RWDGO8
secret_key: FWd1mccBfnw6VHa2vod98NEQktRCYlCronxbO1aQ

62 changes: 14 additions & 48 deletions tutorials/dataavenue-cluster/nodes/cloud_init_dataavenue.yaml
Original file line number Diff line number Diff line change
@@ -1,59 +1,25 @@
#cloud-config
write_files:
- ################################
- # SCRIPT TO INSTALL DOCKER
- ################################
- - path: /bin/deploy-docker.sh
- content: |
- #!/bin/bash
- echo "Install DOCKER starts."
- set -x
- apt-get update
- apt-get install -y --no-install-recommends linux-image-extra-$(uname -r) linux-image-extra-virtual apt-transport-https ca-certificates curl software-properties-common
- echo deb http://apt.dockerproject.org/repo ubuntu-trusty main > /etc/apt/sources.list.d/docker.list
- curl -fsSL https://apt.dockerproject.org/gpg | apt-key add -
- add-apt-repository "deb https://apt.dockerproject.org/repo/ ubuntu-$(lsb_release -cs) main"
- apt-get update
- apt-get install -y docker-engine
- echo "DOCKER_OPTS='-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock'" > /etc/default/docker
- service docker restart
- echo "Install DOCKER stops."
- permissions: '755'

- #####################################
- # SCRIPT TO INSTALL DOCKER-COMPOSE
- #####################################
- - path: /bin/deploy-docker-compose.sh
- content: |
- #!/bin/bash
- set -x
- echo "Install DOCKER-COMPOSE starts."
- sudo curl -L https://github.com/docker/compose/releases/download/1.16.1/docker-compose-`uname -s`-`uname -m` -o /usr/local/bin/docker-compose
- sudo chmod +x /usr/local/bin/docker-compose
- echo "Install DOCKER-COMPOSE stops."
- permissions: '755'

################################
- # SCRIPT TO INSTALL DATAAVENUE
+ # SCRIPT TO INSTALL AND DEPLOY DATAAVENUE
################################
- path: /bin/deploy-dataavenue.sh
content: |
#!/bin/bash
set -x
- echo "Install DATAAVENUE starts."
- wget -O data-avenue-docker-compose-latest.tar.gz https://nextcloud.sztaki.hu/s/EiNnjwDjR9xfTQZ/download --directory /home/ubuntu/
- cd /home/ubuntu
- tar zxvf data-avenue-docker-compose-latest.tar.gz
- cd /home/ubuntu/data-avenue-docker-compose
- docker-compose up > dataavenue.out &
- echo "Install DATAAVENUE stops."
+ echo "Deploying DATAAVENUE..."
+ apt-get update
+ git clone https://github.com/SZTAKI-LPDS/data-avenue.git /home/ubuntu/data-avenue
+ mvn -f /home/ubuntu/data-avenue/pom.xml package
+ cp /home/ubuntu/data-avenue/data-avenue.core.war/target/dataavenue.war /home/ubuntu/data-avenue/data-avenue.docker-compose/dataavenue/webapps/
+ docker-compose -f /home/ubuntu/data-avenue/data-avenue.docker-compose/docker-compose.yml up -d
+ echo "Done."
permissions: '755'

packages:
- git
- maven
- openjdk-8-jdk
- docker-compose
runcmd:
- #Install DOCKER
- - /bin/deploy-docker.sh
- #Install DOCKER-COMPOSE
- - /bin/deploy-docker-compose.sh
#Run Dataavenue
- /bin/deploy-dataavenue.sh
- - echo "DATAAAVENUE NODES'S CONTEXTUALIZATION DONE."
+ - echo "DATAAVENUE NODE CONTEXTUALIZATION DONE."
35 changes: 10 additions & 25 deletions tutorials/dataavenue-cluster/nodes/cloud_init_storage.yaml
Original file line number Diff line number Diff line change
@@ -1,34 +1,19 @@
#cloud-config
write_files:
################################
- # SCRIPT TO INSTALL DOCKER
+ # SCRIPT TO INSTALL AND DEPLOY MINIO
################################
- - path: /bin/deploy-docker.sh
+ - path: /bin/deploy-minio.sh
content: |
#!/bin/bash
- echo "Install DOCKER starts."
set -x
+ echo "Deploying MINIO..."
apt-get update
- apt-get install -y --no-install-recommends linux-image-extra-$(uname -r) linux-image-extra-virtual apt-transport-https ca-certificates curl software-properties-common
- echo deb http://apt.dockerproject.org/repo ubuntu-trusty main > /etc/apt/sources.list.d/docker.list
- curl -fsSL https://apt.dockerproject.org/gpg | apt-key add -
- add-apt-repository "deb https://apt.dockerproject.org/repo/ ubuntu-$(lsb_release -cs) main"
- apt-get update
- apt-get install -y docker-engine
- echo "DOCKER_OPTS='-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock'" > /etc/default/docker
- service docker restart
- echo "Install DOCKER stops."
+ docker run -d -v /mnt:/data -p 80:9000/tcp --name minio -e "MINIO_ACCESS_KEY={{variables.access_key}}" -e "MINIO_SECRET_KEY={{variables.secret_key}}" minio/minio server /data
+ echo "Done."
permissions: '755'

packages:
- docker-compose
runcmd:
- #Install DOCKER
- - /bin/deploy-docker.sh
- #Install CEPH NETWORK
- - docker network inspect ceph 2>&1 > /dev/null || docker network create --driver bridge --subnet 172.16.13.0/28 --gateway 172.16.13.1 ceph
- - mkdir -p /etc/ceph
- #Run CEPH CONTAINER
- - docker run -d --name=ceph --net=ceph -e CEPH_DEMO_UID={{variables.storage_user_name}} -e CEPH_DEMO_ACCESS_KEY={{variables.access_key}} -e CEPH_DEMO_SECRET_KEY={{variables.secret_key}} -e MON_IP=172.16.13.2 -e CEPH_NETWORK=172.16.13.0/28 -e CEPH_PUBLIC_NETWORK=172.16.13.0/28 -p 80:80 -p 5000:5000 -p 6789:6789 -p 6800-6805:6800-6805 -v /etc/ceph:/etc/ceph ceph/demo:tag-build-master-jewel-ubuntu-16.04
- #- docker exec ceph ceph -s
- #Create user to the S3 storage
- #- docker exec ceph radosgw-admin user create --uid={{variables.storage_user_name}} --display-name="{{variables.storage_user_name}}" --access-key={{variables.access_key}} --secret={{variables.secret_key}} > .env
- - echo "STORAGE NODE'S CONTEXTUALISATION DONE"
+ #Run MINIO
+ - /bin/deploy-minio.sh
+ - echo "STORAGE NODE CONTEXTUALISATION DONE"
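Since the MinIO container above publishes its S3 port on port 80 (the ``-p 80:9000`` mapping), any S3-compatible client can talk to the storage node. As an illustration only (the profile name is invented; the keys mirror the defaults in ``infra-dataavenue.yaml``), an AWS CLI profile could look like this:

```ini
; ~/.aws/credentials -- illustrative profile for the tutorial's MinIO node
[minio-tutorial]
aws_access_key_id = A8Q2WPCWAELW61RWDGO8
aws_secret_access_key = FWd1mccBfnw6VHa2vod98NEQktRCYlCronxbO1aQ
```

Buckets could then be listed with something like ``aws --profile minio-tutorial --endpoint-url http://[StorageIP] s3 ls`` once the node is up (an untested sketch).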
37 changes: 15 additions & 22 deletions tutorials/dataavenue-cluster/nodes/node_definitions.yaml
Original file line number Diff line number Diff line change
@@ -2,38 +2,31 @@
-
resource:
type: nova
- endpoint: replace_with_endpoint_of_nova_interface_of_your_cloud
- project_id: replace_with_projectid_to_use
+ endpoint: https://sztaki.cloud.mta.hu:5000/v3
+ project_id: a9c30db63ddf47a98045ef9c726c7436
user_domain_name: Default
- image_id: replace_with_id_of_your_image_on_your_target_cloud
- network_id: replace_with_id_of_network_on_your_target_cloud
- flavor_name: replace_with_id_of_the_flavor_on_your_target_cloud
- key_name: replace_with_name_of_keypair_or_remove
- security_groups:
- -
- replace_with_security_group_to_add_or_remove_section
- floating_ip: add_yes_if_you_need_floating_ip_or_remove
- floating_ip_pool: replace_with_name_of_floating_ip_pool_or_remove
+ image_id: 6bba6dc3-b6d5-4b15-942e-61e0ef2f93cb
+ network_id: 01efee1c-858c-4047-a48a-e2fab056f82a
+ flavor_name: 3
+ key_name: attila-key
+ security_groups: [Open]
contextualisation:
type: cloudinit
context_template: !yaml_import
url: file://cloud_init_dataavenue.yaml

'node_def:storage_node':
-
resource:
type: nova
- endpoint: replace_with_endpoint_of_nova_interface_of_your_cloud
- project_id: replace_with_projectid_to_use
+ endpoint: https://sztaki.cloud.mta.hu:5000/v3
+ project_id: a9c30db63ddf47a98045ef9c726c7436
user_domain_name: Default
- image_id: replace_with_id_of_your_image_on_your_target_cloud
- network_id: replace_with_id_of_network_on_your_target_cloud
- flavor_name: replace_with_id_of_the_flavor_on_your_target_cloud
- key_name: replace_with_name_of_keypair_or_remove
- security_groups:
- -
- replace_with_security_group_to_add_or_remove_section
- floating_ip: add_yes_if_you_need_floating_ip_or_remove
- floating_ip_pool: replace_with_name_of_floating_ip_pool_or_remove
+ image_id: 6bba6dc3-b6d5-4b15-942e-61e0ef2f93cb
+ network_id: 01efee1c-858c-4047-a48a-e2fab056f82a
+ flavor_name: 3
+ key_name: attila-key
+ security_groups: [Open]
contextualisation:
type: cloudinit
context_template: !yaml_import
