# Run EICrecon using podio over TCP
(N.B. This is based on the FABRIC iperf3 example.)

This will set up nodes at both CERN and Washington DC to demonstrate transferring podio events over TCP and processing them with the EICrecon software from ePIC.


## Import the FABlib Library


In [1]:
from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager

fablib = fablib_manager()

fablib.show_config();

| Setting | Value |
|---|---|
| Orchestrator | orchestrator.fabric-testbed.net |
| Credential Manager | cm.fabric-testbed.net |
| Core API | uis.fabric-testbed.net |
| Token File | /home/fabric/.tokens.json |
| Project ID | a7818636-1fa1-4e77-bb03-d171598b0862 |
| Bastion Host | bastion.fabric-testbed.net |
| Bastion Username | davidl_0004580836 |
| Bastion Private Key File | /home/fabric/work/fabric_config/fabric-bastion-key |
| Slice Public Key File | /home/fabric/work/fabric_config/slice_key.pub |
| Slice Private Key File | /home/fabric/work/fabric_config/slice_key |


## Create the Experiment Slice

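The following creates two nodes, one at each site, and attaches each to the FABRIC FABNet IPv4 network with `add_fabnet()`. After boot, each node is tuned, has docker enabled, and pulls the `eicweb/jug_xl:nightly` image.

Patience here. This will take a while, not only to set up the slice but also to pull the docker image, which is at least 11.3 GB.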

In [2]:
slice_name = 'EICreconTCP'
# [site1, site2] = fablib.get_random_sites(count=2)
site1 = 'CERN'
site2 = 'WASH'
print(f"Sites: {site1}, {site2}")

node1_name='Node1'
node2_name='Node2'

Sites: CERN, WASH


In [3]:
#Create Slice
slice = fablib.new_slice(name=slice_name)

# Node1
node1 = slice.add_node(name=node1_name, cores=4, disk=50, ram=16, site=site1, image='docker_rocky_8')
node1.add_fabnet()
node1.add_post_boot_upload_directory('node_tools','.')
node1.add_post_boot_execute('sudo node_tools/host_tune.sh')
node1.add_post_boot_execute('node_tools/enable_docker.sh {{ _self_.image }} ')
node1.add_post_boot_execute('docker pull eicweb/jug_xl:nightly ')


# Node2
node2 = slice.add_node(name=node2_name, cores=4, disk=50, ram=16, site=site2, image='docker_rocky_8')
node2.add_fabnet()
node2.add_post_boot_upload_directory('node_tools','.')
node2.add_post_boot_execute('sudo node_tools/host_tune.sh')
node2.add_post_boot_execute('node_tools/enable_docker.sh {{ _self_.image }} ')
node2.add_post_boot_execute('docker pull eicweb/jug_xl:nightly ')

#Submit Slice Request
slice.submit();

Exception: Submit request error: return_status Status.FAILURE, slice_reservations: (500)
Reason: INTERNAL SERVER ERROR
HTTP response headers: HTTPHeaderDict({'Server': 'nginx/1.21.6', 'Date': 'Sun, 14 Apr 2024 01:18:51 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Content-Length': '206', 'Connection': 'keep-alive', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Headers': 'DNT, User-Agent, X-Requested-With, If-Modified-Since, Cache-Control, Content-Type, Range, Authorization', 'Access-Control-Allow-Methods': 'GET, POST, PUT, PATCH, DELETE, OPTIONS', 'Access-Control-Allow-Origin': '*', 'Access-Control-Expose-Headers': 'Content-Length, Content-Range, X-Error', 'X-Error': 'Slice EICreconTCP already exists'})
HTTP response body: b'{\n    "errors": [\n        {\n            "details": "Slice EICreconTCP already exists",\n            "message": "Internal Server Error"\n        }\n    ],\n    "size": 1,\n    "status": 500,\n    "type": "error"\n}'
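
The `X-Error` header in the output above reports that the slice `EICreconTCP` already exists, which is what happens when this cell is rerun against a live slice. One way to make the cell rerunnable is to reuse the existing slice instead of submitting a new one. The following is a minimal sketch and not part of the original notebook: the helper name `get_or_create_slice` and the `build` callback are hypothetical, and it assumes that `fablib.get_slice()` raises an exception when no slice with the given name is found.

```python
def get_or_create_slice(fablib, name, build):
    """Return (slice, created) for the slice called `name`.

    `fablib` is a FablibManager instance. `build` is a hypothetical
    callback that adds nodes and networks to a freshly created slice.
    Assumption: fablib.get_slice() raises when the slice does not exist.
    """
    try:
        existing = fablib.get_slice(name=name)
    except Exception:
        existing = None

    if existing is not None:
        # Reuse the running slice rather than submitting a duplicate.
        return existing, False

    new_slice = fablib.new_slice(name=name)
    build(new_slice)       # e.g. the add_node / add_fabnet calls above
    new_slice.submit()
    return new_slice, True


# Usage in the notebook (hypothetical):
#   slice, created = get_or_create_slice(fablib, slice_name, add_nodes)
#   print("created new slice" if created else "reusing existing slice")
```

Catching a bare `Exception` is deliberately broad here because the exact exception type is not confirmed by this notebook; narrow it if the library documents a specific one.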


## Build software

This will clone and build the software needed for the test. It is built on both nodes, although really only podio2tcp is needed on one of them.

This will take quite a long time, since a lot of code is compiled and each node has only four cores.

In [4]:
%cd /home/fabric/work/RTDP
slice = fablib.get_slice(slice_name)
node1 = slice.get_node(name=node1_name)
node2 = slice.get_node(name=node2_name)

# Upload the build script to both nodes and make it executable
node1.upload_file('build_all.sh', 'build_all.sh')
node2.upload_file('build_all.sh', 'build_all.sh')
node1.execute("chmod +x build_all.sh", quiet=True)
node2.execute("chmod +x build_all.sh", quiet=True)

# List the remote working directory on node2 as a sanity check
stdout2, stderr2 = node2.execute("ls -l", quiet=False)

# Run the build inside the container, with the remote working directory mounted at /work
cmd  = "docker run --rm "
cmd += "--network host "
cmd += "-v ${PWD}:/work "
cmd += "eicweb/jug_xl:nightly "
cmd += "/work/build_all.sh"

# The two builds run one after the other; output is logged to <node name>.log
stdout1, stderr1 = node1.execute(cmd, quiet=True, output_file=f"{node1.get_name()}.log");
stdout2, stderr2 = node2.execute(cmd, quiet=True, output_file=f"{node2.get_name()}.log");


/home/fabric/work/RTDP
total 56
-rwxrwxr-x.  1 rocky rocky 1708 Apr 14 01:19 build_all.sh
drwxr-xr-x.  5 root  root  4096 Apr 14 00:34 EICrecon
drwxr-xr-x.  6 root  root  4096 Apr 14 00:34 EICrecon.build
drwxr-xr-x.  8 root  root  4096 Apr 14 00:29 EICrecon.src
drwxr-xr-x.  7 root  root  4096 Apr 14 00:29 JANA2
drwxr-xr-x.  4 root  root  4096 Apr 14 00:29 JANA2.build
drwxr-xr-x. 10 root  root  4096 Apr 14 00:27 JANA2.src
drwxrwxr-x.  2 rocky rocky 4096 Apr 13 22:51 node_tools
drwxr-xr-x.  7 root  root  4096 Apr 14 00:27 podio
drwxr-xr-x.  3 root  root  4096 Apr 14 00:27 podio2tcp.build
drwxr-xr-x.  9 root  root  4096 Apr 14 00:27 podio.build
drwxr-xr-x. 11 root  root  4096 Apr 14 00:24 podio.src
drwxr-xr-x.  3 root  root  4096 Apr 14 00:34 podiostream.build
drwxr-xr-x.  9 root  root  4096 Apr 14 00:27 SRO-RTDP
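
The build cell above runs the two builds one after the other, so the total wait is roughly the sum of both. Since the nodes are independent, the same command could be issued to both at once. The following is a minimal sketch using Python's standard `concurrent.futures`; the helper name `execute_on_all` is hypothetical, and it assumes that calling `node.execute()` on different nodes from separate threads is safe.

```python
from concurrent.futures import ThreadPoolExecutor


def execute_on_all(nodes, cmd):
    """Run `cmd` on every node concurrently.

    Returns a dict mapping node name to whatever node.execute() returns
    (a (stdout, stderr) tuple in this notebook). Each node's output is
    logged to <node name>.log, as in the build cell above.
    """
    def run(node):
        return node.execute(cmd, quiet=True,
                            output_file=f"{node.get_name()}.log")

    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        futures = {node.get_name(): pool.submit(run, node) for node in nodes}
        # result() blocks until that node's command has finished
        return {name: future.result() for name, future in futures.items()}


# Usage in the notebook (hypothetical):
#   results = execute_on_all([node1, node2], cmd)
#   stdout1, stderr1 = results[node1.get_name()]
```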


## Copy input file to source host

TODO: The podiostr input file is currently copied from my bastion account. It would be better to have this pulled from xrootd by the remote nodes.

The input file was copied to my bastion account with:

~~~
  scp -J davidl@scilogin.jlab.org davidl@ifarm9:/home/davidl/work_eic/2024.04.11.podio_stream/simout.1000.edmhep.root.podiostr
~~~

Instructions for creating it can be found here: https://github.com/JeffersonLab/SRO-RTDP/tree/main/src/utilities/cpp/podio2tcp

In [None]:
node1.upload_file('simout.1000.edmhep.root.podiostr', 'simout.1000.edmhep.root.podiostr')

## Set up scripts to run the test

Numerous environment variables need to be set inside the docker container before running the software. This is easiest to do by putting them in a script. The cell below copies the `setenv.sh` script from the current directory to each of the nodes.

The client node (the one running eicrecon and consuming events) needs to know the IP address of the server node. The cell below appends it to the `setenv.sh` script on the client node as the `PODIOHOST` environment variable, so that it can be used easily in the eicrecon command.



In [8]:
# Get IP addresses
node1_addr = node1.get_interface(network_name=f'FABNET_IPv4_{node1.get_site()}').get_ip_addr()
node2_addr = node2.get_interface(network_name=f'FABNET_IPv4_{node2.get_site()}').get_ip_addr()
print(f'node1: {node1_addr}')
print(f'node2: {node2_addr}')

# Upload the setenv.sh file
node1.upload_file('setenv.sh', 'setenv.sh')
node2.upload_file('setenv.sh', 'setenv.sh')

# Append setting the PODIOHOST to the setenv.sh script
cmd = f"echo \"export PODIOHOST={node1_addr}\" >> setenv.sh"
node2.execute(cmd, quiet=False, output_file=f"{node2.get_name()}.log");

# Copy run scripts
node1.upload_file('run_source.sh', 'run_source.sh')
node1.execute("chmod +x run_source.sh", quiet=True)
node2.upload_file('run_eicrecon.sh', 'run_eicrecon.sh')
node2.execute("chmod +x run_eicrecon.sh", quiet=True)

node1: 10.143.4.2
node2: 10.133.7.2


('', '')
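
The cell above builds the `echo ... >> setenv.sh` command by hand with escaped quotes. A slightly more defensive variant is sketched below: it checks that the value really is an IP address before writing it, and lets `shlex.quote` handle the shell quoting. The helper name `export_append_command` is hypothetical and not part of the original notebook.

```python
import ipaddress
import shlex


def export_append_command(var, addr, script="setenv.sh"):
    """Build a shell command that appends `export VAR=ADDR` to `script`.

    Raises ValueError if `addr` is not a valid IPv4 or IPv6 address, so a
    missing or malformed address is caught before it reaches the node.
    """
    ip = ipaddress.ip_address(str(addr))
    line = f"export {var}={ip}"
    return f"echo {shlex.quote(line)} >> {shlex.quote(script)}"


# Usage in the notebook (hypothetical):
#   node2.execute(export_append_command("PODIOHOST", node1_addr), quiet=False)
print(export_append_command("PODIOHOST", "10.143.4.2"))
```

With the node1 address shown in the output above, this prints `echo 'export PODIOHOST=10.143.4.2' >> setenv.sh`.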

## Run processes

At this point, it is probably easier to run the processes manually in separate terminals connected to each host. Grab the ssh commands for each node from the top of this notebook and establish a connection to each in separate terminals. Then run docker like this:

On node1 (CERN):
~~~
docker run -it --rm --network host -v ${PWD}:/work eicweb/jug_xl:nightly /work/run_source.sh
~~~

On node2 (WASH):
~~~
docker run -it --rm --network host -v ${PWD}:/work eicweb/jug_xl:nightly /work/run_eicrecon.sh
~~~
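
If the ssh commands for the nodes are not at hand, they can be printed from the slice object. The following is a minimal sketch, assuming the `slice` object from the earlier cells and FABlib's `get_nodes()`, `get_name()` and `get_ssh_command()` methods; the helper name `ssh_commands` is hypothetical.

```python
def ssh_commands(slice_obj):
    """Return a dict mapping each node's name to its ssh command line."""
    return {node.get_name(): node.get_ssh_command()
            for node in slice_obj.get_nodes()}


# Usage in the notebook (hypothetical):
#   for name, command in ssh_commands(slice).items():
#       print(f"{name}: {command}")
```

Paste each printed command into its own terminal, then start the corresponding docker command shown above.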



In [9]:
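# Note: `-it` requests an interactive terminal, which node.execute() may not provide.
# If running these from the notebook rather than a terminal, consider dropping `-it`.
# node1.execute("docker run -it --rm --network host -v ${PWD}:/work eicweb/jug_xl:nightly /work/run_source.sh", quiet=True)
# node2.execute("docker run -it --rm --network host -v ${PWD}:/work eicweb/jug_xl:nightly /work/run_eicrecon.sh", quiet=True)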

## Delete the Slice

Please delete your slice when you are done with your experiment.

In [10]:
# slice = fablib.get_slice(slice_name)
# slice.delete()
