7 changes: 5 additions & 2 deletions Dockerfile-auth
@@ -1,19 +1,22 @@
FROM python:3.9.0
MAINTAINER Komal Thareja<komal.thareja@gmail.com>

ARG HANDLERS_VER=1.0rc2

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
VOLUME ["/usr/src/app"]

EXPOSE 11000

COPY . /usr/src/app/
COPY requirements.txt /usr/src/app/
COPY fabric_cf /usr/src/app/fabric_cf
RUN pip3 install --no-cache-dir -r requirements.txt
RUN mkdir -p "/etc/fabric/message_bus/schema"
RUN mkdir -p "/etc/fabric/actor/config"
RUN mkdir -p "/var/log/actor"
RUN cp /usr/local/lib/python3.9/site-packages/fabric_mb/message_bus/schema/*.avsc /etc/fabric/message_bus/schema
RUN pip3 install fabric-am-handlers==1.0rc1
RUN pip3 install fabric-am-handlers==${HANDLERS_VER}

ENTRYPOINT ["python3"]
CMD ["-m", "fabric_cf.authority"]
3 changes: 2 additions & 1 deletion Dockerfile-broker
@@ -7,7 +7,8 @@ VOLUME ["/usr/src/app"]

EXPOSE 11000

COPY . /usr/src/app/
COPY requirements.txt /usr/src/app/
COPY fabric_cf /usr/src/app/fabric_cf
RUN pip3 install --no-cache-dir -r requirements.txt
RUN mkdir -p "/etc/fabric/message_bus/schema"
RUN mkdir -p "/etc/fabric/actor/config"
3 changes: 2 additions & 1 deletion Dockerfile-cf
@@ -7,7 +7,8 @@ VOLUME ["/usr/src/app"]

EXPOSE 11000

COPY . /usr/src/app/
COPY requirements.txt /usr/src/app/
COPY fabric_cf /usr/src/app/fabric_cf
RUN pip3 install --no-cache-dir -r requirements.txt
RUN mkdir -p "/etc/fabric/message_bus/schema"
RUN mkdir -p "/etc/fabric/actor/config"
3 changes: 2 additions & 1 deletion Dockerfile-orchestrator
@@ -8,7 +8,8 @@ VOLUME ["/usr/src/app"]
EXPOSE 11000
EXPOSE 8700

COPY . /usr/src/app/
COPY requirements.txt /usr/src/app/
COPY fabric_cf /usr/src/app/fabric_cf
RUN pip3 install --no-cache-dir -r requirements.txt
RUN mkdir -p "/etc/fabric/message_bus/schema"
RUN mkdir -p "/etc/fabric/actor/config"
8 changes: 6 additions & 2 deletions README.md
@@ -18,10 +18,14 @@ Broker is an agent of CF that collects resource availability information from mu
AM is a CF agent responsible for managing aggregate resources. It is under the control of the owner of the aggregate and provides promises of resources to brokers and controllers/orchestrators. More details can be found [here](fabric_cf/authority/Readme.md)

## Orchestrator
Orchestrator is a CF agent that makes allocation decisions (embedding) of user requests into available resources. It communicates with the user to collect slice requests, with the broker or aggregate managers to collect resource promises, and with aggregate managers to provision the promised resources. It creates slices, configures resources, maintains their state, and modifies slices and slivers. More details can be found [here](fabric_cf/orchestrator/Readme.md)
Orchestrator is a CF agent that makes allocation decisions (embedding) of user requests into available resources. It communicates with the user to collect slice requests, with the broker or aggregate managers to collect resource promises, and with aggregate managers to provision the promised resources. It creates slices, configures resources, maintains their state, and modifies slices and slivers. More details can be found [here](fabric_cf/orchestrator/README.md)

## Architecture
The following diagram depicts an overall architecture for the Control Framework.
![Architecture](./images/cf.png)

## Requirements
Python 3.7+
Python 3.9+

## Build Docker Images

2 changes: 1 addition & 1 deletion fabric_cf/__init__.py
@@ -1 +1 @@
__VERSION__ = "1.0rc2"
__VERSION__ = "1.0rc3"
6 changes: 6 additions & 0 deletions fabric_cf/actor/boot/configuration.py
@@ -398,6 +398,12 @@ def get_global_config(self) -> GlobalConfig:
"""
return self.global_config

def get_log_config(self) -> dict:
"""
Return Log config
"""
return self.global_config.get_logging()

def get_runtime_config(self) -> dict:
"""
Return Runtime Config
2 changes: 2 additions & 0 deletions fabric_cf/actor/core/common/constants.py
@@ -274,3 +274,5 @@ class Constants:
DEFAULT_VLAN_OFFSET = 10
VLAN_START = 1
VLAN_END = 4096

CONFIG_PROPERTIES_FILE = "config.properties.file"
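As a small illustration of how the new constant is meant to be used (the config path below is a placeholder, not a value taken from this PR), handler properties are expected to carry the handler config file location under this key:

```python
# Illustrative only: the new CONFIG_PROPERTIES_FILE constant serves as the properties key
# under which a handler's YAML config path is passed; the path itself is a placeholder.
from fabric_cf.actor.core.common.constants import Constants

properties = {Constants.CONFIG_PROPERTIES_FILE: "/etc/fabric/actor/config/vm_handler_config.yml"}
print(properties[Constants.CONFIG_PROPERTIES_FILE])
```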
20 changes: 13 additions & 7 deletions fabric_cf/actor/core/container/globals.py
@@ -69,22 +69,21 @@ def __init__(self):
self.lock = threading.Lock()
self.jwt_validator = None

def make_logger(self):
def make_logger(self, *, log_config: dict = None):
"""
Detects the path and level for the log file from the actor config and sets
up a logger. Instead of detecting the path and/or level from the
config, a custom path and/or level for the log file can be passed as
optional arguments.
:param log_path: Path to custom log file
:param log_level: Custom log level
:param log_config: Log config
:return: logging.Logger object
"""
if log_config is None:
if self.config is None:
raise RuntimeError('No config information available')

# Get the log path
if self.config is None:
raise RuntimeError('No config information available')
log_config = self.config.get_global_config().get_logging()
log_config = self.config.get_global_config().get_logging()
if log_config is None:
raise RuntimeError('No logging config information available')

@@ -198,6 +197,13 @@ def get_config(self) -> Configuration:
raise InitializationException(Constants.UNINITIALIZED_STATE)
return self.config

def get_log_config(self) -> dict:
"""
Get the Log configuration
@return dict
"""
return self.get_config().get_log_config()

def get_kafka_config_admin_client(self) -> dict:
"""
Get Kafka Config Admin Client
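A rough sketch of the resulting logging flow, assuming an initialized actor container (this is not code from the PR, just the call pattern the new accessors suggest):

```python
# Rough sketch of the new logging flow: the actor exposes its logging config via
# get_log_config(), and make_logger() now consumes that dict directly instead of
# discovering a path/level on its own. Assumes an initialized actor container.
from fabric_cf.actor.core.container.globals import GlobalsSingleton


def build_logger_from_actor_config():
    globals_obj = GlobalsSingleton.get()
    log_config = globals_obj.get_log_config()               # new accessor added in this PR
    return globals_obj.make_logger(log_config=log_config)   # new keyword-only parameter
```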
@@ -48,6 +48,8 @@ def __init__(self):
self.thread = None
self.future_lock = threading.Condition()
self.stopped = False
from fabric_cf.actor.core.container.globals import GlobalsSingleton
self.log_config = GlobalsSingleton.get().get_log_config()

def __getstate__(self):
state = self.__dict__.copy()
@@ -136,7 +138,7 @@ def invoke_handler(self, unit: ConfigToken, operation: str):

handler_class = ReflectionUtils.create_instance_with_params(module_name=handler.get_module_name(),
class_name=handler.get_class_name())
handler_obj = handler_class(self.logger, handler.get_properties())
handler_obj = handler_class(self.log_config, handler.get_properties())

future = None
if operation == Constants.TARGET_CREATE:
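For context, the reflection call above appears roughly equivalent to the following `importlib` sketch, which resolves the handler class by module and class name before it is constructed with the new `(log_config, properties)` signature (illustrative only, not the actual utility code):

```python
# Hypothetical importlib-based equivalent of the reflection call used in invoke_handler():
# resolve a handler class by module/class name, then construct it with the new signature.
import importlib


def load_handler_class(module_name: str, class_name: str):
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Example resolution using the NoOpHandler shipped in this PR; log_config and properties
# would come from the processor (self.log_config, handler.get_properties()).
handler_cls = load_handler_class("fabric_cf.actor.handlers.no_op_handler", "NoOpHandler")
# handler_obj = handler_cls(log_config, properties)
```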
35 changes: 23 additions & 12 deletions fabric_cf/actor/handlers/handler_base.py
@@ -23,11 +23,13 @@
#
#
# Author: Komal Thareja (kthare10@renci.org)
import logging
from abc import ABC, abstractmethod
from typing import Tuple

import yaml

from fabric_cf.actor.core.common.constants import Constants
from fabric_cf.actor.core.plugins.handlers.config_token import ConfigToken


@@ -36,18 +38,27 @@ class ConfigurationException(Exception):


class HandlerBase(ABC):
@staticmethod
def load_config(path):
"""
Read config file
"""
if path is None:
raise ConfigurationException("No data source has been specified")
print("Reading config file: {}".format(path))
config_dict = None
with open(path) as f:
config_dict = yaml.safe_load(f)
return config_dict
def __init__(self, log_config: dict, properties: dict):
self.log_config = log_config
self.properties = properties
self.logger = None
self.config = None

def get_config(self) -> dict:
if self.config is None:
config_properties_file = self.properties.get(Constants.CONFIG_PROPERTIES_FILE, None)
if config_properties_file is None:
raise ConfigurationException("No data source has been specified")
self.get_logger().debug(f"Reading config file: {config_properties_file}")
with open(config_properties_file) as f:
self.config = yaml.safe_load(f)
return self.config

def get_logger(self) -> logging.Logger:
if self.logger is None:
from fabric_cf.actor.core.container.globals import GlobalsSingleton
self.logger = GlobalsSingleton.get().make_logger(log_config=self.log_config)
return self.logger

@abstractmethod
def create(self, unit: ConfigToken) -> Tuple[dict, ConfigToken]:
37 changes: 18 additions & 19 deletions fabric_cf/actor/handlers/no_op_handler.py
@@ -23,7 +23,6 @@
#
#
# Author: Komal Thareja (kthare10@renci.org)
import time
import traceback
from typing import Tuple

@@ -33,62 +32,62 @@


class NoOpHandler(HandlerBase):
def __init__(self, logger, properties: dict):
self.logger = logger
self.properties = properties
def __init__(self, log_config: dict, properties: dict):
super().__init__(log_config=log_config, properties=properties)

def create(self, unit: ConfigToken) -> Tuple[dict, ConfigToken]:
result = None
try:
self.logger.info(f"Create invoked for unit: {unit}")
unit.sliver.state = 'active'
unit.sliver.instance_name = 'instance_001'
unit.sliver.management_ip = '1.2.3.4'
self.get_logger().info(f"Create invoked for unit: {unit}")
sliver = unit.get_sliver()
sliver.state = 'active'
sliver.instance_name = 'instance_001'
sliver.management_ip = '1.2.3.4'
result = {Constants.PROPERTY_TARGET_NAME: Constants.TARGET_CREATE,
Constants.PROPERTY_TARGET_RESULT_CODE: Constants.RESULT_CODE_OK,
Constants.PROPERTY_ACTION_SEQUENCE_NUMBER: 0}
except Exception as e:
result = {Constants.PROPERTY_TARGET_NAME: Constants.TARGET_CREATE,
Constants.PROPERTY_TARGET_RESULT_CODE: Constants.RESULT_CODE_EXCEPTION,
Constants.PROPERTY_ACTION_SEQUENCE_NUMBER: 0}
self.logger.error(e)
self.logger.error(traceback.format_exc())
self.get_logger().error(e)
self.get_logger().error(traceback.format_exc())
finally:

self.logger.info(f"Create completed")
self.get_logger().info(f"Create completed")
return result, unit

def delete(self, unit: ConfigToken) -> Tuple[dict, ConfigToken]:
result = None
try:
self.logger.info(f"Delete invoked for unit: {unit}")
self.get_logger().info(f"Delete invoked for unit: {unit}")
result = {Constants.PROPERTY_TARGET_NAME: Constants.TARGET_DELETE,
Constants.PROPERTY_TARGET_RESULT_CODE: Constants.RESULT_CODE_OK,
Constants.PROPERTY_ACTION_SEQUENCE_NUMBER: 0}
except Exception as e:
result = {Constants.PROPERTY_TARGET_NAME: Constants.TARGET_DELETE,
Constants.PROPERTY_TARGET_RESULT_CODE: Constants.RESULT_CODE_EXCEPTION,
Constants.PROPERTY_ACTION_SEQUENCE_NUMBER: 0}
self.logger.error(e)
self.logger.error(traceback.format_exc())
self.get_logger().error(e)
self.get_logger().error(traceback.format_exc())
finally:

self.logger.info(f"Delete completed")
self.get_logger().info(f"Delete completed")
return result, unit

def modify(self, unit: ConfigToken) -> Tuple[dict, ConfigToken]:
result = None
try:
self.logger.info(f"Modify invoked for unit: {unit}")
self.get_logger().info(f"Modify invoked for unit: {unit}")
result = {Constants.PROPERTY_TARGET_NAME: Constants.TARGET_MODIFY,
Constants.PROPERTY_TARGET_RESULT_CODE: Constants.RESULT_CODE_OK,
Constants.PROPERTY_ACTION_SEQUENCE_NUMBER: 0}
except Exception as e:
self.logger.error(e)
self.logger.error(traceback.format_exc())
self.get_logger().error(e)
self.get_logger().error(traceback.format_exc())
result = {Constants.PROPERTY_TARGET_NAME: Constants.TARGET_MODIFY,
Constants.PROPERTY_TARGET_RESULT_CODE: Constants.RESULT_CODE_EXCEPTION,
Constants.PROPERTY_ACTION_SEQUENCE_NUMBER: 0}
finally:
self.logger.info(f"Modify completed")
self.get_logger().info(f"Modify completed")
return result, unit
29 changes: 21 additions & 8 deletions fabric_cf/authority/Readme.md
@@ -2,10 +2,28 @@
An aggregate manager (AM) controls access to the substrate components. It controls some set of infrastructure resources in a particular site consisting of a set of servers, storage units, network elements or other components under common ownership and control. AMs inform brokers about available resources by passing them resource advertisement information models. AMs may be associated with more than one broker, and the partitioning of resources between brokers is a decision left to the AM. Oversubscription is possible, depending on the deployment needs.
FABRIC enables a substrate provider to outsource resource arbitration and calendar scheduling to a broker. By delegating resources to the broker, the AM consents to the broker’s policies, and agrees to try to honor reservations issued by the broker if the user has authorization on the AM.

Besides common code, each AM type has specific plugins that determine its resource allocation behavior (Resource Management Policy) and the specific actions it takes to provision a sliver (Resource Handler). Both plugins are invoked by AM common core code based on the resource type or type of request being considered.
Besides common code, each AM type has specific plugins that determine its resource allocation behavior (Resource Management Policy) and the specific actions it takes to provision a sliver (Resource Handler). Both plugins are invoked by AM common core code based on the resource type or type of request being considered. More information on AM handlers can be found [here](https://github.com/fabric-testbed/AMHandlers).
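As a rough illustration of the Resource Handler contract defined by `HandlerBase` in this PR (the class name and behavior below are illustrative only; real handlers live in the AMHandlers repository linked above), a minimal handler might look like this:

```python
# Minimal sketch of a custom Resource Handler against the new HandlerBase contract:
# the constructor receives (log_config, properties) and all logging goes through get_logger().
# ExampleHandler and its behavior are illustrative, not part of the repository.
from typing import Tuple

from fabric_cf.actor.core.common.constants import Constants
from fabric_cf.actor.core.plugins.handlers.config_token import ConfigToken
from fabric_cf.actor.handlers.handler_base import HandlerBase


class ExampleHandler(HandlerBase):
    def _ok(self, target: str, unit: ConfigToken) -> Tuple[dict, ConfigToken]:
        # Log via the lazily created logger and report success for the given target
        self.get_logger().info(f"{target} invoked for unit: {unit}")
        return {Constants.PROPERTY_TARGET_NAME: target,
                Constants.PROPERTY_TARGET_RESULT_CODE: Constants.RESULT_CODE_OK,
                Constants.PROPERTY_ACTION_SEQUENCE_NUMBER: 0}, unit

    def create(self, unit: ConfigToken) -> Tuple[dict, ConfigToken]:
        return self._ok(Constants.TARGET_CREATE, unit)

    def delete(self, unit: ConfigToken) -> Tuple[dict, ConfigToken]:
        return self._ok(Constants.TARGET_DELETE, unit)

    def modify(self, unit: ConfigToken) -> Tuple[dict, ConfigToken]:
        return self._ok(Constants.TARGET_MODIFY, unit)
```

The Ansible Handler Processor resolves such a class by module and class name and constructs it with the actor's log config and the handler properties.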

AM runs as a set of four containers, depicted in the picture below.
![AM Pod](../../images/am-pod.png)

- AM: runs the Control Framework AM
- Postgres: database that maintains slice and reservation information
- Neo4j: maintains the aggregate substrate information, i.e. the Aggregate Resource Model
- PDP: Policy Decision Point used by the AM to authorize user requests

An overview of the AM thread model is shown below:
![Thread Model](../../images/am.png)

- Main: spawns all threads, loads the config, and starts the Prometheus exporter
- Actor Clock: delivers a periodic event to the Actor main thread based on the configured time interval
- Actor: kernel thread responsible for processing the requested operations on slices/reservations
- Kafka Producer: thread pool responsible for sending outgoing messages from the AM over Kafka
- Timer: timer thread used to time out requests such as claim
- Kafka Consumer: consumer thread responsible for processing incoming messages for the AM over Kafka
- Ansible Processor: responsible for invoking the appropriate handler depending on the resource type
- Handler Process Pool: process pool for running handler Ansible scripts

NOTE: The Authority container is still built on Python 3.8 because of an open bug affecting Python 3.9 which causes Ansible failures:
https://github.com/dask/distributed/issues/4168
## Configuration
`config.site.am.yaml` depicts an example config file for an Aggregate Manager.
### Pre-requisites
@@ -31,14 +49,9 @@ Run the `setup.sh` script to set up an Aggregate Manager. User is expected to spec
- Path to Aggregate Resource Model i.e. graphml
- Path to Handler Config File

#### Production
```
./setup.sh site1-am password ./config.site.am.yaml ../../neo4j/RENCI-ad.graphml ./vm_handler_config.yml
```
#### Development
```
./setup.sh site1-am password ./config.site.am.yaml ../../neo4j/RENCI-ad.graphml dev
```

### Environment and Configuration
The script `setup.sh` generates a directory for the AM containing a `.env` file with the environment variables used by `docker-compose.yml`