Apache Airavata is a software framework for executing and managing computational jobs on distributed computing resources, including local clusters, supercomputers, national grids, and academic and commercial clouds. Airavata builds on general concepts of service-oriented computing, distributed messaging, and workflow composition and orchestration. Airavata bundles a server package with an API, client Software Development Kits (SDKs), and a general-purpose reference UI implementation.
Key Features:
- 🔧 Service-oriented architecture with distributed messaging
- 🔄 Fully managed task lifecycle (environment setup, data staging, execution, and output retrieval)
- ☁️ Multi-cloud and hybrid-cloud support
- 🖥️ Comprehensive API and SDK ecosystem
- 🚀 Reference UI implementations
Apache Airavata is composed of modular components spanning core services, data management, user interfaces, and developer tooling.
- `airavata` → Main resource management and task orchestration middleware
- `airavata-custos` → Identity and access management framework
- `airavata-mft` → Managed file transfer services
- `airavata-portals` → All frontends for Airavata
- `airavata-data-lake` → Data lake and storage backend
- `airavata-data-catalog` → Metadata and search services
- `airavata-docs` → Developer documentation
- `airavata-user-docs` → End-user guides
- `airavata-admin-user-docs` → Admin-focused documentation
- `airavata-custos-docs` → Custos documentation
- `airavata-site` → Project website
- `airavata-sandbox` → Prototypes and early-stage work
- `airavata-labs` → Experimental projects
- `airavata-jupyter-kernel` → Jupyter integration
- `airavata-cerebrum` → Airavata for neuroscience
`org.apache.airavata.server.ServerMain`

The API Server bootstraps seven services, each implementing the `org.apache.airavata.common.utils.IServer` interface, and provides the main entry point to Airavata.
- **API** (`org.apache.airavata.api.server.AiravataAPIServer`): the public-facing API consumed by Airavata SDKs and dashboards. It bridges external clients and internal services, and is served over Thrift.
- **DB Event Manager** (`org.apache.airavata.db.event.manager.DBEventManagerRunner`): monitors task execution events (launch, transitions, completion/failure) and syncs them to the Airavata DB via pub/sub hooks.
- **Registry** (`org.apache.airavata.registry.api.service.RegistryAPIServer`): manages metadata and definitions for executable tasks and applications.
- **Credential Store** (`org.apache.airavata.credential.store.server.CredentialStoreServer`): manages secure storage and retrieval of credentials for accessing registered compute resources.
- **Sharing Registry** (`org.apache.airavata.sharing.registry.server.SharingRegistryServer`): handles sharing and permissioning of Airavata resources between users and groups.
- **Orchestrator** (`org.apache.airavata.orchestrator.server.OrchestratorServer`): constructs workflow DAGs, assigns unique IDs to tasks, and hands them off to the workflow manager.
- **Profile** (`org.apache.airavata.service.profile.server.ProfileServiceServer`): manages users, tenants, compute resources, and group profiles.
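The bootstrap pattern described above, where one entry point starts a set of services sharing a common lifecycle interface, can be sketched as follows. This is a hypothetical Python illustration, not the real Java API; the actual interface is `org.apache.airavata.common.utils.IServer`, and the class names below are invented for the sketch.

```python
from abc import ABC, abstractmethod

class Server(ABC):
    """Toy stand-in for the IServer lifecycle contract."""

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

class RegistryServer(Server):
    """Hypothetical service; real services are Java classes listed above."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

class ServerMain:
    """Boots every registered service in order, mirroring ServerMain's role."""

    def __init__(self, servers):
        self.servers = servers

    def start_all(self):
        for server in self.servers:
            server.start()

    def stop_all(self):
        # stop in reverse order so dependents go down first
        for server in reversed(self.servers):
            server.stop()

main = ServerMain([RegistryServer()])
main.start_all()
print(main.servers[0].running)  # True
```

The design point is that `ServerMain` never needs to know what each service does, only that it honors the shared start/stop contract.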
`org.apache.airavata.helix.impl.workflow.PreWorkflowManager`

The pre-workflow manager listens on the internal MQ (KafkaConsumer) for inbound tasks in the pre-execution phase. When a task DAG is received, it handles the environment setup and data staging phases of the DAG robustly, including fault handling. All of this happens before the task DAG is submitted to the controller, and subsequently to the participant.
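The pre-execution behavior, running each phase in order with fault handling, can be sketched as below. This is a toy illustration under assumed names (`run_with_retries`, `pre_workflow`); the real manager is Java code driven by Kafka, and its fault-handling policy is more involved than a fixed retry count.

```python
def run_with_retries(phase, attempts=3):
    """Run one phase, retrying on transient failures (simplified fault handling)."""
    for attempt in range(1, attempts + 1):
        try:
            return phase()
        except RuntimeError:
            if attempt == attempts:
                raise  # exhausted retries: propagate as a real failure

def pre_workflow(dag):
    """Run env setup and data staging before the DAG reaches the controller."""
    results = []
    for phase_name, phase in dag:
        results.append((phase_name, run_with_retries(phase)))
    return results

# a staging step that fails once, then succeeds, to exercise the retry path
calls = {"n": 0}
def flaky_staging():
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("transient transfer failure")
    return "staged"

dag = [("env-setup", lambda: "ready"), ("data-staging", flaky_staging)]
print(pre_workflow(dag))  # [('env-setup', 'ready'), ('data-staging', 'staged')]
```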
`org.apache.airavata.helix.impl.controller.HelixController`

The controller manages the step-by-step transition of task state on the Helix side. It uses Apache Helix to track step start, completion, and failure paths, ensuring that the next step starts upon successful completion and that the current step is retried on failure.
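The advance-on-success, retry-on-failure logic can be reduced to a small state-machine sketch. This is illustrative only; in reality this bookkeeping is delegated to Apache Helix rather than hand-rolled, and `drive`/`execute` are invented names.

```python
def drive(steps, execute, max_retries=2):
    """Advance through steps in order; retry a failing step before giving up."""
    completed = []
    i = 0
    while i < len(steps):
        step = steps[i]
        for attempt in range(max_retries + 1):
            if execute(step, attempt):
                completed.append(step)  # success: move to the next step
                i += 1
                break
        else:
            # failure path: retries exhausted without a successful run
            raise RuntimeError(f"step {step} failed after retries")
    return completed

# simulate "stage-data" failing on its first attempt, then succeeding
def execute(step, attempt):
    return not (step == "stage-data" and attempt == 0)

print(drive(["setup-env", "stage-data", "submit-job"], execute))
# ['setup-env', 'stage-data', 'submit-job']
```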
`org.apache.airavata.helix.impl.participant.GlobalParticipant`

The participant synchronizes the Helix-side state transition of a task with its concrete execution on the Airavata side. The currently registered steps are: `EnvSetupTask`, `InputDataStagingTask`, `OutputDataStagingTask`, `JobVerificationTask`, `CompletingTask`, `ForkJobSubmissionTask`, `DefaultJobSubmissionTask`, `LocalJobSubmissionTask`, `ArchiveTask`, `WorkflowCancellationTask`, `RemoteJobCancellationTask`, `CancelCompletingTask`, `DataParsingTask`, `ParsingTriggeringTask`, and `MockTask`.
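A participant with a registry of named steps amounts to a dispatch table from step names to handlers. The sketch below is hypothetical Python (the real tasks are Java classes registered with Helix), with only two of the step names above wired up for illustration.

```python
REGISTERED_TASKS = {}

def register(name):
    """Decorator that registers a handler under a step name."""
    def wrap(fn):
        REGISTERED_TASKS[name] = fn
        return fn
    return wrap

@register("EnvSetupTask")
def env_setup(ctx):
    ctx["env"] = "ready"
    return ctx

@register("InputDataStagingTask")
def input_staging(ctx):
    ctx["inputs"] = ["input.dat"]  # hypothetical staged file
    return ctx

def run_step(name, ctx):
    """Dispatch a Helix-side step name to its concrete Airavata-side handler."""
    if name not in REGISTERED_TASKS:
        raise ValueError(f"unregistered step: {name}")
    return REGISTERED_TASKS[name](ctx)

ctx = run_step("EnvSetupTask", {})
ctx = run_step("InputDataStagingTask", ctx)
print(ctx)  # {'env': 'ready', 'inputs': ['input.dat']}
```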
`org.apache.airavata.helix.impl.workflow.PostWorkflowManager`

The post-workflow manager listens on the internal MQ (KafkaConsumer) for inbound tasks in the post-execution phase. Once the main task completes executing, the realtime monitor is notified, which triggers the post-workflow phase; the manager then submits this state change to the controller and handles the cleanup and output-fetching phases of the task DAG robustly, including fault handling.
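The trigger chain, where a completion event arriving on the queue kicks off output fetching and cleanup, can be sketched with an in-memory queue standing in for Kafka. All names here (`post_workflow`, `controller_log`, the event schema) are invented for illustration.

```python
import queue

events = queue.Queue()   # stand-in for the internal MQ
controller_log = []      # stand-in for state changes submitted to the controller

def post_workflow(event):
    """React to a task-completion event with post-execution phases."""
    actions = []
    if event["status"] == "COMPLETED":
        actions.append(f"fetch-outputs:{event['task_id']}")
        actions.append(f"cleanup:{event['task_id']}")
        controller_log.append(("POST_PROCESSING", event["task_id"]))
    return actions

# the realtime monitor would publish this when the main task finishes
events.put({"task_id": "task-42", "status": "COMPLETED"})
print(post_workflow(events.get()))
# ['fetch-outputs:task-42', 'cleanup:task-42']
```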
`org.apache.airavata.helix.impl.workflow.ParserWorkflowManager`

The parser-workflow manager listens on the internal MQ (KafkaConsumer) for inbound tasks in the post-completion phase, which includes transforming generated outputs into different formats. This component is not actively used in Airavata.
Class Name: `org.apache.airavata.monitor.email.EmailBasedMonitor`

The email monitor periodically checks an email inbox for job status updates sent via email. When it reads a new email containing a job status update, it relays that state change to the internal MQ (KafkaProducer).
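Extracting a status update from such an email boils down to parsing the message body for a job ID and a status keyword. The sketch below is hypothetical: real scheduler notifications (PBS, SLURM, etc.) each have their own formats, and the regex and field names here are invented for illustration.

```python
import re
from email.message import EmailMessage

# assumed format: a body line carrying "Job_id=<id> ... <STATUS>"
STATUS_RE = re.compile(
    r"Job_id=(?P<job_id>\S+).*?(?P<status>BEGUN|COMPLETED|FAILED)", re.S
)

def parse_status(msg: EmailMessage):
    """Return a state-change dict from a status email, or None if unrecognized."""
    match = STATUS_RE.search(msg.get_payload())
    if not match:
        return None
    return {"job_id": match.group("job_id"), "status": match.group("status")}

msg = EmailMessage()
msg["Subject"] = "PBS JOB 1234.cluster"
msg.set_content("Job_id=1234.cluster state changed: COMPLETED")
print(parse_status(msg))  # {'job_id': '1234.cluster', 'status': 'COMPLETED'}
```

In the real monitor, a parsed result like this would be serialized and published to Kafka rather than returned.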
Class Name: `org.apache.airavata.monitor.realtime.RealtimeMonitor`

The realtime monitor listens for incoming state-change messages on the internal MQ (KafkaConsumer) and relays each state change back to the internal MQ (KafkaProducer). When a task completes at the compute resource, the realtime monitor is notified of this.
Class Name: `org.apache.airavata.agent.connection.service.AgentServiceApplication`

The agent service is the backend for launching interactive jobs using Airavata. It provides constructs to launch a custom "Agent" on a compute resource that connects back to the agent service through a bidirectional gRPC channel. The Airavata Python SDK primarily uses the Agent Service (gRPC) and the Airavata API (Thrift) to submit and execute interactive jobs, spawn subprocesses, and create network tunnels to subprocesses, even when they are behind NAT.
Class Name: `org.apache.airavata.research.service.ResearchServiceApplication`

The research service is the backend for the Airavata research catalog. It provides the API to add, list, modify, and publish notebooks, repositories, datasets, and computational models in Cybershuttle, and to launch interactive remote sessions that utilize them in a research setting.
Before building Apache Airavata, ensure you have:
| Requirement | Version | Check Using |
|---|---|---|
| Java SDK | 17+ | `java --version` |
| Apache Maven | 3.8+ | `mvn -v` |
| Git | Latest | `git -v` |
- Clone the project repository

```sh
git clone git@github.com:apache/airavata.git
cd airavata
```

- Build the project

```sh
# with tests (slower, but safer)
mvn clean install

# OR without tests (faster)
mvn clean install -DskipTests
```
Once the build completes, the service bundles will be generated in the `./distributions` folder.
```
├── airavata-sharing-registry-distribution-0.21-SNAPSHOT.tar.gz
├── apache-airavata-agent-service-0.21-SNAPSHOT.tar.gz
├── apache-airavata-api-server-0.21-SNAPSHOT.tar.gz
├── apache-airavata-controller-0.21-SNAPSHOT.tar.gz
├── apache-airavata-email-monitor-0.21-SNAPSHOT.tar.gz
├── apache-airavata-file-server-0.21-SNAPSHOT.tar.gz
├── apache-airavata-parser-wm-0.21-SNAPSHOT.tar.gz
├── apache-airavata-participant-0.21-SNAPSHOT.tar.gz
├── apache-airavata-post-wm-0.21-SNAPSHOT.tar.gz
├── apache-airavata-pre-wm-0.21-SNAPSHOT.tar.gz
├── apache-airavata-realtime-monitor-0.21-SNAPSHOT.tar.gz
└── apache-airavata-research-service-0.21-SNAPSHOT.tar.gz

1 directory, 12 files
```
⚠️ **Note:** Docker deployment is experimental and not recommended for production use.
Prerequisites:
- Docker Engine 20.10+
- Docker Compose 2.0+
Build and Deploy:
```sh
# 1. Build source and Docker images
git clone https://github.com/apache/airavata.git
cd airavata
mvn clean install
mvn docker:build -pl modules/distribution

# 2. Start all supporting services and Airavata microservices
#    (API Server, Helix components, and Job Monitors)
docker-compose \
  -f modules/distribution/src/main/docker/docker-compose.yml \
  up -d

# 3. Verify services are running
docker-compose ps
```
Service Endpoints:
- API Server: `airavata.host:8960`
- Profile Service: `airavata.host:8962`
- Keycloak: `airavata.host:8443`

Host Configuration:

Add to your `/etc/hosts` file:

```
127.0.0.1 airavata.host
```
Stop Services:
```sh
docker-compose \
  -f modules/ide-integration/src/main/containers/docker-compose.yml \
  -f modules/distribution/src/main/docker/docker-compose.yml \
  down
```
The easiest way to get started with running Airavata locally and setting up a development environment is to follow the instructions in the ide-integration README. Those instructions will guide you through setting up a development environment with IntelliJ IDEA.
We welcome contributions from the community! Here's how you can help:
- 🍴 Fork the repository
- 🌿 Create a feature branch
- ✨ Make your changes
- 🧪 Add tests if applicable
- 📬 Submit a pull request
Learn More:
Get Help:
- 📧 User Mailing List: users@airavata.apache.org
- 👨‍💻 Developer Mailing List: dev@airavata.apache.org
- 📋 All Mailing Lists: airavata.apache.org/mailing-list
Licensed to the Apache Software Foundation (ASF) under one or more contributor
license agreements. See the NOTICE file distributed with this work for
additional information regarding copyright ownership.
The ASF licenses this file to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance with the License.
You may obtain a copy of the License at:
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed
under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
See the LICENSE file for complete license details.