From dd63de0964d02e247754c0e9fac896c00f77c7b8 Mon Sep 17 00:00:00 2001 From: Jason Andrews Date: Fri, 11 Jul 2025 22:04:25 +0100 Subject: [PATCH] first review of 2 instance OpenAD kit Learning Path --- .../1_functional_safety.md | 113 ++++++++---------- .../2_data_distribution_service.md | 55 ++++----- .../3_container_spliting.md | 62 +++++----- .../4_multiinstance_executing.md | 14 +-- .../openadkit2_safetyisolation/_index.md | 35 +++++- 5 files changed, 148 insertions(+), 131 deletions(-) diff --git a/content/learning-paths/automotive/openadkit2_safetyisolation/1_functional_safety.md b/content/learning-paths/automotive/openadkit2_safetyisolation/1_functional_safety.md index 026b729ba8..d08f796580 100644 --- a/content/learning-paths/automotive/openadkit2_safetyisolation/1_functional_safety.md +++ b/content/learning-paths/automotive/openadkit2_safetyisolation/1_functional_safety.md @@ -8,67 +8,61 @@ layout: learningpathall ## Why Functional Safety Matters in Automotive Software -[Functional Safety](https://en.wikipedia.org/wiki/Functional_safety) refers to a system's ability to detect potential faults and respond appropriately to ensure that the system remains in a safe state, preventing harm to individuals or damage to equipment. +Functional Safety refers to a system's ability to detect potential faults and respond appropriately to ensure that the system remains in a safe state, preventing harm to individuals or damage to equipment. -This is particularly important in **automotive, autonomous driving, medical devices, industrial control, robotics and aerospace** applications, where system failures can lead to severe consequences. +This is particularly important in automotive, autonomous driving, medical devices, industrial control, robotics and aerospace applications, where system failures can lead to severe consequences. 
-In software development, Functional Safety focuses on minimizing risks through **software design, testing, and validation** to ensure that critical systems operate in a predictable, reliable, and verifiable manner. This means developers must consider: -- **Error detection mechanisms** -- **Exception handling** -- **Redundancy design** -- **Development processes compliant with safety standards** +In software development, Functional Safety focuses on minimizing risks through software design, testing, and validation to ensure that critical systems operate in a predictable, reliable, and verifiable manner. This means developers must consider: +- Error detection mechanisms +- Exception handling +- Redundancy design +- Development processes compliant with safety standards ### Definition and Importance of Functional Safety -The core of Functional Safety lies in **risk management**, which aims to reduce the impact of system failures. +The core of Functional Safety lies in risk management, which aims to reduce the impact of system failures. -In autonomous vehicles, Functional Safety ensures that if sensor data is incorrect, the system can enter a **safe state**, preventing incorrect driving decisions. +In autonomous vehicles, Functional Safety ensures that if sensor data is incorrect, the system can enter a safe state, preventing incorrect driving decisions. The three core objectives of Functional Safety are: -1. **Prevention** - - Reducing the likelihood of errors through rigorous software development processes and testing. In the electric vehicle, the battery systems monitor temperature to prevent overheating. -2. **Detection** - - Quickly identifying errors using built-in diagnostic mechanisms (e.g., Built-in Self-Test, BIST). -3. **Mitigation** - - Controlling the impact of failures to ensure the overall safety of the system. +1. Prevention: Reducing the likelihood of errors through rigorous software development processes and testing. 
For example, in an electric vehicle, the battery system monitors temperature to prevent overheating. +2. Detection: Quickly identifying errors using built-in diagnostic mechanisms, such as built-in self-test. +3. Mitigation: Controlling the impact of failures to ensure the overall safety of the system. -This approach is critical in applications such as **autonomous driving, flight control, and medical implants**, where failures can result in **severe consequences**. +This approach is critical in applications such as autonomous driving, flight control, and medical implants, where failures can result in severe consequences. ### ISO 26262: Automotive Functional Safety Standard -[ISO 26262](https://www.iso.org/standard/68383.html) is a functional safety standard specifically for **automotive electronics and software systems**. It defines a comprehensive [V-model](https://en.wikipedia.org/wiki/V-model) aligned safety lifecycle, covering all phases from **requirement analysis, design, development, testing, to maintenance**. +ISO 26262 is a functional safety standard specifically for automotive electronics and software systems. It defines a comprehensive V-model-aligned safety lifecycle, covering all phases from requirement analysis through design, development, and testing to maintenance. Key Concepts of ISO 26262: -- **ASIL (Automotive Safety Integrity Level)** - - Evaluates the risk level of different system components (A, B, C, D, where **D represents the highest safety requirement**). +- ASIL (Automotive Safety Integrity Level) + - Evaluates the risk level of different system components (A, B, C, D, where D represents the highest safety requirement). - For example: ASIL A can be Dashboard light failure (low risk) and ASIL D is Brake system failure (high risk). 
- https://en.wikipedia.org/wiki/Automotive_Safety_Integrity_Level -- **HARA (Hazard Analysis and Risk Assessment)** +- HARA (Hazard Analysis and Risk Assessment) - Analyzes hazards and assesses risks to determine necessary safety measures. -- **Safety Mechanisms** +- Safety Mechanisms - Includes real-time error detection, system-level fault tolerance, and defined fail-safe or fail-operational fallback states. Typical Application Scenarios: -- **Autonomous Driving Systems**: +- Autonomous Driving Systems: - Ensures that even if sensors (e.g., LiDAR, radar, cameras) provide faulty data, the vehicle will not make dangerous decisions. -- **Powertrain Control**: +- Powertrain Control: - Prevents braking system failures that could lead to loss of control. -- **Battery Management System (BMS)**: +- Battery Management System (BMS): - Prevents battery overheating or excessive discharge in electric vehicles. -For more details, you can check this video: [What is Functional Safety?](https://www.youtube.com/watch?v=R0CPzfYHdpQ) - - ### Common Use Cases of Functional Safety in Automotive -- **Autonomous Driving**: + +- Autonomous Driving: - Ensures the vehicle can operate safely or enter a fail-safe state when sensors like LiDAR, radar, or cameras malfunction. - Functional Safety enables real-time fault detection and fallback logic to prevent unsafe driving decisions. -- **Powertrain Control**: +- Powertrain Control: - Monitors throttle and brake signals to prevent unintended acceleration or braking loss. - Includes redundancy, plausibility checks, and emergency overrides to maintain control under failure conditions. -- **Battery Management Systems (BMS)**: +- Battery Management Systems (BMS): - Protects EV batteries from overheating, overcharging, or deep discharge. - Safety functions include temperature monitoring, voltage balancing, and relay cut-off mechanisms to prevent thermal runaway. 
@@ -77,7 +71,7 @@ A widely adopted approach in modern automotive platforms is the Safety Island— ### Safety Island: Enabling Functional Safety in Autonomous Systems -In automotive systems, a **General ECU (Electronic Control Unit)** typically runs non-critical tasks such as infotainment or navigation, whereas a **Safety Island** is dedicated to executing safety-critical control logic (e.g., braking, steering) with strong isolation, redundancy, and determinism. +In automotive systems, a General ECU (Electronic Control Unit) typically runs non-critical tasks such as infotainment or navigation, whereas a Safety Island is dedicated to executing safety-critical control logic (e.g., braking, steering) with strong isolation, redundancy, and determinism. The table below compares the characteristics of a General ECU and a Safety Island in terms of their role in supporting Functional Safety. @@ -91,53 +85,50 @@ The table below compares the characteristics of a General ECU and a Safety Islan This contrast highlights why safety-focused software needs a dedicated hardware domain with certified execution behavior. -**Safety Island** is an independent safety subsystem separate from the main processor. It is responsible for monitoring and managing system safety. If the main processor fails or becomes inoperable, Safety Island can take over critical safety functions such as **deceleration, stopping, and fault handling** to prevent catastrophic system failures. +Safety Island is an independent safety subsystem separate from the main processor. It is responsible for monitoring and managing system safety. If the main processor fails or becomes inoperable, Safety Island can take over critical safety functions such as deceleration, stopping, and fault handling to prevent catastrophic system failures. 
Key Capabilities of Safety Island -- **System Health Monitoring** +- System Health Monitoring - Continuously monitors the operational status of the main processor (e.g., ADAS control unit, ECU) and detects potential errors or anomalies. -- **Fault Detection and Isolation** +- Fault Detection and Isolation - Independently evaluates and initiates emergency handling if the main processing unit encounters errors, overheating, computational failures, or unresponsiveness. -- **Providing Essential Safety Functions** +- Providing Essential Safety Functions - Even if the main system crashes, Safety Island can still execute minimal safety operations, such as: - Autonomous Vehicles → Safe stopping (Fail-Safe Mode) - Industrial Equipment → Emergency power cutoff or speed reduction - ### Why Safety Island Matters for Functional Safety Safety Island plays a critical role in Functional Safety by ensuring that the system can handle high-risk scenarios and minimize catastrophic failures. How Safety Island Enhances Functional Safety -1. **Acts as an Independent Redundant Safety Layer** +1. Acts as an Independent Redundant Safety Layer - Even if the main system fails, it can still operate independently. -2. **Supports ASIL-D Safety Level** - - Monitors ECU health status and executes emergency safety strategies (e.g., emergency braking). -3. **Provides Independent Fault Detection and Recovery Mechanisms** - - **Fail-Safe**: Activates a **safe mode**, such as limiting vehicle speed or switching to manual control. - - **Fail-Operational**: Ensures that high-safety applications (e.g., aerospace systems) can continue operating under certain conditions. - -For more insights on **Arm's Functional Safety solutions**, you can refer to: [Arm Functional Safety Compute Blog](https://community.arm.com/arm-community-blogs/b/automotive-blog/posts/functional-safety-compute) - +2. 
Supports ASIL-D Safety Level + - Monitors ECU health status and executes emergency safety strategies, such as emergency braking. +3. Provides Independent Fault Detection and Recovery Mechanisms + - Fail-Safe: Activates a safe mode, such as limiting vehicle speed or switching to manual control. + - Fail-Operational: Ensures that high-safety applications, such as aerospace systems, can continue operating under certain conditions. ### Functional Safety in the Software Development Lifecycle -Functional Safety impacts **both hardware and software development**, particularly in areas such as requirement changes, version management, and testing validation. +Functional Safety impacts both hardware and software development, particularly in areas such as requirement changes, version management, and testing validation. For example, in ASIL-D level applications, every code modification requires a complete impact analysis and regression testing to ensure that new changes do not introduce additional risks. ### Functional Safety Requirements in Software Development + These practices ensure the software development process meets industry safety standards and can withstand system-level failures: -- **Requirement Specification** - - Clearly defining **safety-critical requirements** and conducting risk assessments. -- **Safety-Oriented Programming** - - Following **MISRA C, CERT C/C++ standards** and using static analysis tools to detect errors. -- **Fault Handling Mechanisms** - - Implementing **redundancy design and health monitoring** to handle anomalies. -- **Testing and Verification** - - Using **Hardware-in-the-Loop (HIL)** testing to ensure software safety in real hardware environments. -- **Version Management and Change Control** - - Using **Git, JIRA, Polarion** to track changes for safety audits. 
- -This learning path builds upon the previous containerized [learning path](https://learn.arm.com/learning-paths/automotive/openadkit1_container) guide and introduces Functional Safety design practices from the earliest development stages. - -By establishing an ASIL Partitioning software development environment and leveraging [**SOAFEE**](https://www.soafee.io/) technologies, developers can enhance software consistency and maintainability in Functional Safety applications. +- Requirement Specification + - Clearly defining safety-critical requirements and conducting risk assessments. +- Safety-Oriented Programming + - Following MISRA C, CERT C/C++ standards and using static analysis tools to detect errors. +- Fault Handling Mechanisms + - Implementing redundancy design and health monitoring to handle anomalies. +- Testing and Verification + - Using Hardware-in-the-Loop (HIL) testing to ensure software safety in real hardware environments. +- Version Management and Change Control + - Using Git, JIRA, Polarion to track changes for safety audits. + +By establishing an ASIL Partitioning software development environment and leveraging SOAFEE technologies, you can enhance software consistency and maintainability in Functional Safety applications. + +This Learning Path follows [Deploy Open AD Kit containerized autonomous driving simulation on Arm Neoverse](/learning-paths/automotive/openadkit1_container/) and introduces Functional Safety design practices from the earliest development stages. 
\ No newline at end of file diff --git a/content/learning-paths/automotive/openadkit2_safetyisolation/2_data_distribution_service.md b/content/learning-paths/automotive/openadkit2_safetyisolation/2_data_distribution_service.md index 64aab3cab5..19a36ba606 100644 --- a/content/learning-paths/automotive/openadkit2_safetyisolation/2_data_distribution_service.md +++ b/content/learning-paths/automotive/openadkit2_safetyisolation/2_data_distribution_service.md @@ -8,52 +8,53 @@ layout: learningpathall ### Introduction to DDS Data Distribution Service (DDS) is a real-time, high-performance middleware designed for distributed systems. -It is particularly valuable in automotive software development, including applications such as **autonomous driving (AD)** and **advanced driver assistance systems (ADAS)**. +It is particularly valuable in automotive software development, including applications such as autonomous driving (AD) and advanced driver assistance systems (ADAS). -DDS offers a decentralized architecture that enables scalable, low-latency, and reliable data exchange—making it ideal for managing high-frequency sensor streams. +DDS offers a decentralized architecture that enables scalable, low-latency, and reliable data exchange, making it ideal for managing high-frequency sensor streams. -In modern vehicles, multiple sensors (LiDAR, radar, cameras) must continuously communicate with compute modules. - -DDS ensures these components share data seamlessly and in real time, both within the vehicle and across infrastructure (e.g., V2X systems like traffic lights and road sensors). +In modern vehicles, multiple sensors such as LiDAR, radar, and cameras must continuously communicate with compute modules. +DDS ensures these components share data seamlessly and in real time, both within the vehicle and across infrastructure such as V2X systems, including traffic lights and road sensors. 
### Why Automotive Software Needs DDS -Next-generation automotive software architectures —like [SOAFEE](https://www.soafee.io/)- depend on deterministic, distributed communication. Traditional client-server models introduce latency and single points of failure, while DDS’s publish-subscribe model enables direct, peer-to-peer communication across system components. +Next-generation automotive software architectures, such as SOAFEE, depend on deterministic, distributed communication. Traditional client-server models introduce latency and create single points of failure. In contrast, DDS uses a publish-subscribe model that enables direct, peer-to-peer communication across system components. -For example, a LiDAR sensor broadcasting obstacle data can simultaneously deliver updates to perception, SLAM, and motion planning modules—without redundant network traffic or central coordination. +For example, a LiDAR sensor broadcasting obstacle data can simultaneously deliver updates to perception, SLAM, and motion planning modules. This approach avoids redundant network traffic and does not require central coordination. -Additionally, DDS provides a flexible Quality of Service (QoS) configuration, allowing engineers to fine-tune communication parameters based on system requirements. Low-latency modes are ideal for real-time decision-making in vehicle control, while high-reliability configurations ensure data integrity in safety-critical applications like V2X communication. +Additionally, DDS provides a flexible Quality of Service (QoS) configuration, allowing engineers to fine-tune communication parameters based on system requirements. Low-latency modes are ideal for real-time decision-making in vehicle control, while high-reliability configurations ensure data integrity in safety-critical applications such as V2X communication. 
These capabilities make DDS an essential backbone for autonomous vehicle stacks, where real-time sensor fusion and control coordination are critical for safety and performance. ### DDS Architecture and Operation -DDS uses a **data-centric publish-subscribe (DCPS)** model, allowing producers and consumers of data to communicate without direct dependencies. This modular approach enhances system flexibility and maintainability, making it well-suited for complex automotive environments. +DDS uses a data-centric publish-subscribe (DCPS) model, allowing producers and consumers of data to communicate without direct dependencies. This modular approach enhances system flexibility and maintainability, making it well suited for complex automotive environments. + +DDS organizes communication within domains, which act as isolated scopes. Inside each domain, the following elements are used: +- Topics represent named data streams, such as /vehicle/speed or /perception/objects. +- DataWriters (publishers) send data to topics. +- DataReaders (subscribers) receive data from topics. -DDS organizes communication within **domains**, which act as isolated scopes. Inside each domain: -- ***Topics*** represent named data streams (e.g., /vehicle/speed, /perception/objects) -- ***DataWriters*** (publishers) send data to topics -- ***DataReaders*** (subscribers) receive data from topics This structure enables concurrent, decoupled communication between multiple modules without hardcoding communication links. -Each domain contains multiple **topics**, representing specific data types such as vehicle speed, obstacle detection, or sensor fusion results. **Publishers** use **DataWriters** to send data to these topics, while **subscribers** use **DataReaders** to receive the data. This architecture supports concurrent data processing, ensuring that multiple modules can work with the same data stream simultaneously. 
+Each domain contains multiple topics, representing specific data types such as vehicle speed, obstacle detection, or sensor fusion results. Publishers use DataWriters to send data to these topics, while subscribers use DataReaders to receive the data. This architecture supports concurrent data processing, ensuring that multiple modules can work with the same data stream simultaneously. For example, in an autonomous vehicle, LiDAR, radar, and cameras continuously generate large amounts of sensor data. The perception module subscribes to these sensor topics, processes the data, and then publishes detected objects and road conditions to other components like path planning and motion control. Since DDS automatically handles participant discovery and message distribution, engineers do not need to manually configure communication paths, reducing development complexity. - ### Real-World Use in Autonomous Driving + DDS is widely used in autonomous driving systems, where real-time data exchange is crucial. A typical use case involves high-frequency sensor data transmission and decision-making coordination between vehicle subsystems. For instance, a LiDAR sensor generates millions of data points per second, which need to be shared with multiple modules. DDS allows this data to be published once and received by multiple subscribers, including perception, localization, and mapping components. After processing, the detected objects and road features are forwarded to the path planning module, which calculates the vehicle's next movement. Finally, control commands are sent to the vehicle actuators, ensuring precise execution. This real-time data flow must occur within milliseconds to enable safe autonomous driving. DDS ensures minimal transmission delay, enabling rapid response to dynamic road conditions. 
In emergency scenarios, such as detecting a pedestrian or sudden braking by a nearby vehicle, DDS facilitates instant data propagation, allowing the system to take immediate corrective action. -For example: [Autoware](https://www.autoware.org/)—an open-source autonomous driving software stack—uses DDS to handle high-throughput communication across its modules. +For example, Autoware, an open-source autonomous driving software stack, uses DDS to handle high-throughput communication across its modules. -The **Perception** stack publishes detected objects from LiDAR and camera sensors to a shared topic, which is then consumed by the **Planning** module in real-time. Using DDS allows each subsystem to scale independently while preserving low-latency and deterministic communication. +The Perception stack publishes detected objects from LiDAR and camera sensors to a shared topic, which is then consumed by the Planning module in real-time. Using DDS allows each subsystem to scale independently while preserving low-latency and deterministic communication. ### Publish-Subscribe Model and Data Transmission + Let’s explore how DDS’s publish-subscribe model fundamentally differs from traditional communication methods in terms of scalability, latency, and reliability. Traditional client-server communication requires a centralized server to manage data exchange. This architecture introduces several drawbacks, including increased latency and network congestion, which can be problematic in real-time automotive applications. @@ -63,21 +64,21 @@ DDS adopts a publish-subscribe model, enabling direct communication between syst For example, in an automotive perception system, LiDAR, radar, and cameras continuously publish sensor data. Multiple subscribers, including object detection, lane recognition, and obstacle avoidance modules, can access this data simultaneously without additional network overhead. 
DDS automatically manages message distribution, ensuring efficient resource utilization. DDS supports multiple transport mechanisms to optimize communication efficiency: -- **Shared memory transport**: Ideal for ultra-low-latency communication within an ECU, minimizing processing overhead. -- **UDP or TCP/IP**: Used for inter-device communication, such as V2X applications where vehicles exchange safety-critical messages. -- **Automatic participant discovery**: Eliminates the need for manual configuration, allowing DDS nodes to detect and establish connections dynamically. +* Shared memory transport is ideal for ultra-low-latency communication within an ECU, minimizing processing overhead. +* UDP or TCP/IP is used for inter-device communication, such as V2X applications where vehicles exchange safety-critical messages. +* Automatic participant discovery eliminates the need for manual configuration, allowing DDS nodes to detect and establish connections dynamically. #### Comparison of DDS and Traditional Communication Methods The following table highlights how DDS improves upon traditional client-server communication patterns in the context of real-time automotive applications: -| **Feature** | **Traditional Client-Server Architecture** | **DDS Publish-Subscribe Model** | -|-----------------------|--------------------------------------------|--------------------------- | -| **Data Transmission** | Relies on a central server | Direct peer-to-peer communication | -| **Latency** | Higher latency | Low latency | -| **Scalability** | Limited by server capacity | Suitable for large-scale systems | -| **Reliability** | Server failure affects the whole system | No single point of failure | -| **Use Cases** | Small-scale applications | V2X, autonomous driving | +| Feature | Traditional Client-Server Architecture | DDS Publish-Subscribe Model | +|----------------------|--------------------------------------------|--------------------------- | +| Data Transmission | Relies on a 
central server | Direct peer-to-peer communication | +| Latency | Higher latency | Low latency | +| Scalability | Limited by server capacity | Suitable for large-scale systems | +| Reliability | Server failure affects the whole system | No single point of failure | +| Use Cases | Small-scale applications | V2X, autonomous driving | These features make DDS a highly adaptable solution for automotive software engineers seeking to develop scalable, real-time communication frameworks. diff --git a/content/learning-paths/automotive/openadkit2_safetyisolation/3_container_spliting.md b/content/learning-paths/automotive/openadkit2_safetyisolation/3_container_spliting.md index 52a0b56131..3366b68d39 100644 --- a/content/learning-paths/automotive/openadkit2_safetyisolation/3_container_spliting.md +++ b/content/learning-paths/automotive/openadkit2_safetyisolation/3_container_spliting.md @@ -8,52 +8,54 @@ layout: learningpathall ### System Architecture and Component Design -Now that you’ve explored the concept of a Safety Island -- a dedicated subsystem responsible for executing safety-critical control logic—and learned how DDS (Data Distribution Service) enables real-time, distributed communication, you’ll refactor the original OpenAD Kit architecture into a multi-instance deployment. +Now that you’ve explored the concept of a Safety Island, a dedicated subsystem responsible for executing safety-critical control logic, and learned how DDS (Data Distribution Service) enables real-time, distributed communication, you’ll refactor the original OpenAD Kit architecture into a multi-instance deployment. 
-In the [previous learning path](http://learn.arm.com/learning-paths/automotive/openadkit1_container/), OpenAD Kit deployed three container components on a single Arm-based instance, handling: -- ***Simulation environment*** -- ***Visualization*** -- ***Planning-Control*** +In [Deploy Open AD Kit containerized autonomous driving simulation on Arm Neoverse](http://learn.arm.com/learning-paths/automotive/openadkit1_container/), you deployed three container components on a single Arm-based instance, handling: +- Simulation environment +- Visualization +- Planning-Control In this session, you will split the simulation and visualization stack from the planning-control logic and deploy them across two independent Arm-based instances. These nodes communicate using ROS 2 with DDS as the middleware layer, ensuring low-latency and fault-tolerant data exchange between components. ### Architectural Benefits + This architecture brings several practical benefits: -- ***Enhanced System Stability:*** +- Enhanced System Stability: Decoupling components prevents resource contention and ensures that safety-critical functions remain deterministic and responsive. -- ***Real-Time, Scalable Communication:*** +- Real-Time, Scalable Communication: DDS enables built-in peer discovery and configurable QoS, removing the need for a central broker or manual network setup. -- ***Improved Scalability and Performance Tuning:*** -Each instance can be tuned based on its workload—e.g., simulation tasks can use GPU-heavy hardware, while planning logic may benefit from CPU-optimized setups. +- Improved Scalability and Performance Tuning: +Each instance can be tuned based on its workload—for example, simulation tasks can use GPU-heavy hardware, while planning logic may benefit from CPU-optimized setups. 
-- ***Support for Modular CI/CD Workflows:*** +- Support for Modular CI/CD Workflows: With containerized separation, you can build, test, and deploy each module independently—enabling agile development and faster iteration cycles. ![img1 alt-text#center](aws_example.jpg "Figure 1: Split instance example in AWS") - ### Networking Setting -To begin, launch two Arm-based instances—either as cloud VMs (e.g., AWS EC2) or on-premise Arm servers. +To begin, launch two Arm-based VM instances. This Learning Path uses AWS EC2, but you can use any Arm-based instances. + These instances will independently host your simulation and control workloads. {{% notice Note %}} -The specifications of the two Arm instances don’t need to be identical. In my tests, 16 CPUs and 32GB of RAM have already provided good performance. +The specifications of the two Arm instances don’t need to be identical. In testing, 16 CPUs and 32 GB of RAM provided good performance. {{% /notice %}} After provisioning the machines, determine where you want the `Planning-Control` container to run. The other instance will host the `Simulation Environment` and `Visualization` components. To enable ROS 2 and DDS communication between the two nodes, configure network access accordingly. -If you are using AWS EC2, both instances should be assigned to the same ***Security Group***. + +If you are using AWS EC2, both instances should be assigned to the same Security Group. Within the EC2 Security Group settings: -- Add an Inbound Rule that allows all traffic from the same Security Group (i.e., set the source to the group itself). +- Add an inbound rule that allows all traffic from the same Security Group by setting the source to the security group itself. - Outbound traffic is typically allowed by default and usually does not require changes. 
![img2 alt-text#center](security_group.jpg "Figure 2: AWS Security Group Setting") @@ -64,16 +66,14 @@ Once both systems are operational, record the private IP addresses of each insta ### New Docker YAML Configuration Setting -Before you begin, ensure that Docker is installed on both of your development instances. -You will also need to clone the demo repository used in the previous learning path. +Before you begin, ensure that Docker is installed on both of your development instances. Review the [Docker install guide](/install-guides/docker/docker-engine/) if needed. -First, you need clone the demo repo and create xml file called `cycloneDDS.xml` +First, clone the demo repo and create an XML file called `cycloneDDS.xml`. #### Step 1: Clone the repository and prepare configuration files ```bash git clone https://github.com/odincodeshen/openadkit_demo.autoware.git - cd openadkit_demo.autoware cp docker/docker-compose.yml docker/docker-compose-2ins.yml touch docker/cycloneDDS.xml @@ -96,15 +96,15 @@ Each image is around 4–6 GB, so pulling them may vary depending on your networ {{% /notice %}} This command will download all images defined in the docker-compose-2ins.yml file, including: -- ***odinlmshen/autoware-simulator:v1.0*** -- ***odinlmshen/autoware-planning-control:v1.0*** -- ***odinlmshen/autoware-visualizer:v1.0*** +- odinlmshen/autoware-simulator:v1.0 +- odinlmshen/autoware-planning-control:v1.0 +- odinlmshen/autoware-visualizer:v1.0 #### Step 2: Configure CycloneDDS for Peer-to-Peer Communication The cycloneDDS.xml file is used to customize how CycloneDDS (the middleware used by ROS 2) discovers and communicates between distributed nodes. 
-Please copy the following configuration into docker/cycloneDDS.xml on both machines, and replace the IP addresses with the private IPs of each EC2 instance (e.g., 192.168.xx.yy and 192.168.aa.bb): +Please copy the following configuration into docker/cycloneDDS.xml on both machines, and replace the IP addresses with the private IPs of each EC2 instance. ```xml @@ -137,12 +137,11 @@ Please copy the following configuration into docker/cycloneDDS.xml on both machi ``` {{% notice Note %}} -1. Make sure the network interface name (ens5) matches the one on your EC2 instances. You can verify this using ip -br a. +1. Make sure the network interface name (ens5) matches the one on your EC2 instances. You can verify this using `ip -br a`. 2. This configuration disables multicast and enables static peer discovery between the two machines using unicast. 3. You can find the more detail about CycloneDDS setting [Configuration](https://cyclonedds.io/docs/cyclonedds/latest/config/config_file_reference.html#cyclonedds-domain-internal-socketreceivebuffersize) {{% /notice %}} - #### Step 3: Update the Docker Compose Configuration for Multi-Host Deployment To support running containers across two separate hosts, you’ll need to modify the docker/docker-compose-2ins.yml file. @@ -160,6 +159,7 @@ Since the planning-control and simulator containers will now run on different ma ``` ##### Enable Host Networking + All three containers (visualizer, simulator, planning-control) need access to the host’s network interfaces for DDS-based peer discovery. Replace Docker's default bridge network with host networking: @@ -182,6 +182,7 @@ To ensure that each container uses your custom DDS configuration, mount the curr Add this to every container definition to ensure consistent behavior across the deployment. 
Here is the complete XML file: + ```YAML services: simulator: @@ -266,9 +267,9 @@ sudo sysctl -w net.core.rmem_max=2147483647 ``` Explanation of Parameters -- ***net.ipv4.ipfrag_time=3***: Reduces the timeout for holding incomplete IP fragments, helping free up memory more quickly. -- ***net.ipv4.ipfrag_high_thresh=134217728***: Increases the memory threshold for IP fragment buffers to 128 MB, preventing early drops under high load. -- ***net.core.rmem_max=2147483647***: Expands the maximum socket receive buffer size to support high-throughput DDS traffic. +- `net.ipv4.ipfrag_time=3`: Reduces the timeout for holding incomplete IP fragments, helping free up memory more quickly. +- `net.ipv4.ipfrag_high_thresh=134217728`: Increases the memory threshold for IP fragment buffers to 128 MB, preventing early drops under high load. +- `net.core.rmem_max=2147483647`: Expands the maximum socket receive buffer size to support high-throughput DDS traffic. To ensure these settings persist after reboot, create a configuration file under /etc/sysctl.d/: @@ -285,8 +286,7 @@ Then apply the configuration system-wide: sudo sysctl --system ``` - -Reference: +Links to documentation: - [Autoware dds-setting](https://autowarefoundation.github.io/autoware-documentation/main/installation/additional-settings-for-developers/network-configuration/dds-settings/) - [ROS2 documentation](https://docs.ros.org/en/humble/How-To-Guides/DDS-tuning.html#cyclone-dds-tuning) @@ -369,4 +369,4 @@ This confirms that: - ROS 2 nodes are able to communicate across EC2 instances via /hello topic. - The network settings including host mode, security group, and CycloneDDS peer configuration are correctly applied. -In the next section, you’ll complete the full end-to-end demonstration with all of the concepts. +In the next section, you’ll complete the full end-to-end demonstration. 
diff --git a/content/learning-paths/automotive/openadkit2_safetyisolation/4_multiinstance_executing.md b/content/learning-paths/automotive/openadkit2_safetyisolation/4_multiinstance_executing.md index 6cca5ad53f..fbdd7c7c06 100644 --- a/content/learning-paths/automotive/openadkit2_safetyisolation/4_multiinstance_executing.md +++ b/content/learning-paths/automotive/openadkit2_safetyisolation/4_multiinstance_executing.md @@ -9,10 +9,10 @@ layout: learningpathall ### Demonstrating the Distributed OpenAD Kit in Action -In this session, you’ll bring all the previous setup together and execute the full [OpenAD Kit](https://autoware.org/open-ad-kit/) demo across two Arm-based instances. +In this section, you’ll bring all the previous setup together and execute the full OpenAD Kit demo across two Arm-based instances. OpenAD Kit is an open-source reference design for autonomous driving workloads on Arm. -It demonstrates how Autoware modules can be deployed on scalable infrastructure — whether on a single machine or split across multiple compute nodes. +It demonstrates how Autoware modules can be deployed on scalable infrastructure, whether on a single machine or split across multiple compute nodes. #### Preparing the Execution Scripts @@ -20,7 +20,7 @@ This setup separates the simulation/visualization environment from the planning- To start the system, you need to configure and run separate launch commands on each machine. -On each instance, copy the appropriate launch script into the openadkit_demo.autoware/docker directory. +On each instance, copy the appropriate launch script into the `openadkit_demo.autoware/docker` directory. 
{{< tabpane code=true >}} {{< tab header="Planning-Control" language="bash">}} @@ -67,7 +67,7 @@ On each instance, copy the appropriate launch script into the openadkit_demo.aut {{< /tab >}} {{< /tabpane >}} -You can also find the prepared launch scripts—`opad_planning.sh` and `opad_sim_vis.sh` —inside the `openadkit_demo.autoware/docker` directory on both instances. +You can also find the prepared launch scripts `opad_planning.sh` and `opad_sim_vis.sh` inside the `openadkit_demo.autoware/docker` directory on both instances. These scripts encapsulate the required environment variables and container commands for each role. @@ -86,13 +86,13 @@ On the Simulation and Visualization node, execute: ``` Once both machines are running their respective launch scripts, the Visualizer will generate a web-accessible interface using the machine’s public IP address. -You can open this link in a browser to observe the demo behavior, which will closely resemble the output from the [previous learning path](http://learn.arm.com/learning-paths/automotive/openadkit1_container/4_run_openadkit/). +You can open this link in a browser to observe the demo behavior. ![img3 alt-text#center](split_aws_run.gif "Figure 4: Simulation") -Unlike the previous setup, the containers are now distributed across two separate instances, enabling real-time, cross-node communication. +The containers are now distributed across two separate instances, enabling real-time, cross-node communication. Behind the scenes, this architecture demonstrates how DDS manages low-latency, peer-to-peer data exchange in a distributed ROS 2 environment. -To support demonstration and validation, the simulator is configured to run `three times` sequentially, giving you multiple opportunities to observe how data flows between nodes and verify that communication remains stable across each cycle. 
+To support demonstration and validation, the simulator is configured to run three times sequentially, giving you multiple opportunities to observe how data flows between nodes and verify that communication remains stable across each cycle. Now that you’ve seen the distributed system in action, consider exploring different QoS settings, network conditions, or even adding a third node to expand the architecture further. diff --git a/content/learning-paths/automotive/openadkit2_safetyisolation/_index.md b/content/learning-paths/automotive/openadkit2_safetyisolation/_index.md index 073dc79f51..4319e4824a 100644 --- a/content/learning-paths/automotive/openadkit2_safetyisolation/_index.md +++ b/content/learning-paths/automotive/openadkit2_safetyisolation/_index.md @@ -7,7 +7,7 @@ cascade: minutes_to_complete: 60 -who_is_this_for: This Learning Path targets advanced automotive software engineers developing safety-critical systems. It demonstrates how to use Arm Neoverse cloud infrastructure to accelerate ISO-26262-compliant software prototyping and testing workflows. +who_is_this_for: This Learning Path targets advanced automotive software engineers developing safety-critical systems. It demonstrates how to use Arm Neoverse cloud infrastructure to accelerate ISO-26262 compliant software prototyping and testing workflows. learning_objectives: - Learn the Functional Safety principles—including risk prevention, fault detection, and ASIL compliance—to design robust and certifiable automotive software systems. @@ -16,8 +16,8 @@ learning_objectives: prerequisites: - Two Arm-based Neoverse cloud instances or a local Arm Neoverse Linux computer with at least 16 CPUs and 32GB of RAM. - - Completion of the previous learning path. http://learn.arm.com/learning-paths/automotive/openadkit1_container/ - - Basic knowledge of Docker operations. 
+ - To have completed [Deploy Open AD Kit containerized autonomous driving simulation on Arm Neoverse](/learning-paths/automotive/openadkit1_container/). + - Basic knowledge of using Docker. author: - Odin Shen @@ -37,13 +37,38 @@ operatingsystems: further_reading: - resource: - title: eclipse-zenoh github - link: https://learn.arm.com/learning-paths/automotive/openadkit1_container/ + title: Functional Safety compute for the Software-defined Vehicle + link: https://community.arm.com/arm-community-blogs/b/automotive-blog/posts/functional-safety-compute + type: blog + - resource: + title: SOAFEE + link: https://www.soafee.io/ + type: website + - resource: + title: V-model + link: https://en.wikipedia.org/wiki/V-model + type: documentation + - resource: + title: ISO 26262 + link: https://www.iso.org/standard/68383.html + type: documentation + - resource: + title: Automotive Safety Integrity Level + link: https://en.wikipedia.org/wiki/Automotive_Safety_Integrity_Level + type: documentation + - resource: + title: What is Functional Safety? + link: https://www.youtube.com/watch?v=R0CPzfYHdpQ + type: video + - resource: + title: Eclipse Zenoh + link: https://github.com/eclipse-zenoh/zenoh type: documentation - resource: title: Eclipse Cyclone DDS link: https://github.com/eclipse-cyclonedds/cyclonedds type: documentation + ### FIXED, DO NOT MODIFY