
🧮 Opentelemtry Microservices

Distributed Average Calculation with Microservices


📋 What does this project do?

This project implements a distributed average calculation system using 3 independent microservices. Instead of computing the average in one place, it splits the task into parts and shows how the services communicate with each other, letting you inspect the entire interaction with OpenTelemetry and Jaeger.

The problem it solves:

  • Input: A list of numbers, e.g. [10, 20, 30, 40, 50]
  • Output: The average, 30.0
  • The twist: The calculation is split across 3 different APIs
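
The underlying math is the usual one-liner, average = sum(numbers) / len(numbers) = 150 / 5 = 30.0; the point of the project is to split that one-liner across services and watch the interaction happen.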

🌟 Why Distributed Systems?

In modern software architecture, microservices offer several advantages:

  • 🔧 Scalability: Each service can scale independently
  • 🛠️ Technology flexibility: Different services can use different technologies
  • 👥 Team autonomy: Different teams can work on different services
  • 🔄 Fault isolation: If one service fails, others can continue working

However, with these benefits comes complexity: How do you track a request that travels through multiple services? How do you identify which service is slow? This is where observability becomes crucial.

🔍 The Power of OpenTelemetry + Jaeger

OpenTelemetry and Jaeger solve the distributed systems observability challenge:

🎯 What OpenTelemetry provides:

  • 📊 Distributed tracing: Follow a request across all services
  • 🏷️ Automatic instrumentation: No need to manually add logging everywhere (see the sketch after this list)
  • 📋 Rich metadata: Context about each operation (service, cluster, version, etc.)
  • 🔗 Correlation: Connect logs, metrics, and traces with the same request ID
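
In Python, that automatic instrumentation typically boils down to a couple of calls from the OpenTelemetry contrib packages. Here is a sketch of the general pattern, assuming FastAPI services (which the uvicorn startup later in this README suggests), not necessarily this repo's exact setup:

```python
# Sketch: auto-instrument a FastAPI app plus its outgoing HTTP calls
from fastapi import FastAPI
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.instrumentation.requests import RequestsInstrumentor

app = FastAPI()

FastAPIInstrumentor.instrument_app(app)  # a span for every incoming request
RequestsInstrumentor().instrument()      # spans + context propagation for outgoing requests calls
```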

🕵️ What Jaeger offers:

  • 👀 Visual timeline: See exactly where time is spent
  • 🗺️ Service map: Understand service dependencies automatically
  • ⚡ Performance insights: Identify bottlenecks instantly
  • 🐛 Error tracking: See exactly where and why failures occur

The result: Instead of guessing or manually digging through logs across multiple services, you get a complete visual story of every request.

🏗️ The 3 Microservices

1. 🎯 API Average - The Orchestrator (Port 9001)

  • Endpoint: POST /average
  • Function: Receives the list of numbers and coordinates the entire process
  • Doesn't calculate anything directly, only orchestrates calls to other services

2. ➕ API Add - The Adder (Port 9002)

  • Endpoint: POST /add
  • Function: Receives a list of numbers and returns their sum
  • Example: [10, 20, 30, 40, 50] → 150 (see the service sketch after this list)

3. ➗ API Divide - The Divider (Port 9003)

  • Endpoint: POST /divide
  • Function: Receives two numbers and performs the division
  • Example: 150 ÷ 5 → 30.0
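
To make these services concrete, here is a minimal sketch of what the adder could look like as a FastAPI app. Only the endpoint, port, and request/response shapes come from this README; the model name AddRequest is illustrative:

```python
# api_add.py - illustrative sketch of the adder service (not the repo's exact code)
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class AddRequest(BaseModel):  # hypothetical model name
    numbers: list[float]


@app.post("/add")
def add(payload: AddRequest):
    # Sum the received numbers and answer with {"result": ...}
    return {"result": sum(payload.numbers)}

# Run with: uvicorn api_add:app --host localhost --port 9002
```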

🔄 Step-by-Step Execution Flow

```mermaid
sequenceDiagram
    %% The user initiates the call to the /average endpoint
    participant User as User (Client)
    participant APIAverage as api_average
    participant APIAdd as api_add
    participant APIDivide as api_divide

    %% 1. The user makes a POST /average request with the list of numbers
    User ->> APIAverage: POST /average<br>Body: { numbers: [...] }

    note over APIAverage: <b>api_average</b><br>Receives the list of numbers.<br>Needs to get the sum and then the average.

    %% 2. api_average requests the sum of the numbers from api_add
    APIAverage ->> APIAdd: POST /add<br>Body: { numbers: [...] }

    note over APIAdd: <b>api_add</b><br>Calculates the sum of all<br>received numbers and returns the result.

    %% 3. api_add responds with the sum of the numbers
    APIAdd -->> APIAverage: 200 OK<br>Body: { result: sum }

    note over APIAverage: <b>api_average</b><br>Receives the sum (sum_numbers).

    %% 4. api_average requests the division (sum_numbers ÷ total_numbers) from api_divide
    APIAverage ->> APIDivide: POST /divide<br>Body: { divide: sum_numbers,<br>  divindend: len(numbers) }

    note over APIDivide: <b>api_divide</b><br>Performs the division operation and<br>returns the result.

    %% 5. api_divide responds with the average
    APIDivide -->> APIAverage: 200 OK<br>Body: { result: average }

    note over APIAverage: <b>api_average</b><br>Receives the average and returns it<br>to the end user.

    %% 6. api_average returns the average to the user
    APIAverage -->> User: 200 OK<br>Body: { result: average }
```

📝 Detailed flow explanation:

Step 1: You make a request to http://localhost:9001/average:

```
POST /average
{
  "numbers": [10, 20, 30, 40, 50]
}
```

Step 2: api_average doesn't know how to sum, so it asks api_add:

```
POST http://localhost:9002/add
{
  "numbers": [10, 20, 30, 40, 50]
}
```

Step 3: api_add responds with the sum:

```json
{
  "result": 150
}
```

Step 4: api_average now needs to divide 150 ÷ 5, so it asks api_divide:

```
POST http://localhost:9003/divide
{
  "divide": 150,
  "divindend": 5
}
```

Step 5: api_divide responds with the result:

```json
{
  "result": 30.0
}
```

Step 6: api_average finally responds to you:

```json
{
  "result": 30.0
}
```
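
Putting steps 2 through 6 together, the orchestration inside api_average plausibly looks like the sketch below. The URLs and body fields match the steps above; the use of requests and the model name AverageRequest are assumptions:

```python
# Illustrative sketch of api_average's orchestration. The field names
# "divide" and "divindend" mirror the request bodies shown above.
import requests
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class AverageRequest(BaseModel):  # hypothetical model name
    numbers: list[float]


@app.post("/average")
def average(payload: AverageRequest):
    # Steps 2-3: delegate the sum to api_add
    sum_numbers = requests.post(
        "http://localhost:9002/add", json={"numbers": payload.numbers}
    ).json()["result"]
    # Steps 4-5: delegate the division to api_divide
    avg = requests.post(
        "http://localhost:9003/divide",
        json={"divide": sum_numbers, "divindend": len(payload.numbers)},
    ).json()["result"]
    # Step 6: return the average to the caller
    return {"result": avg}
```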

🚀 How to run the project

1. Install dependencies

```bash
pip install -r requirements.txt
```

2. Start Jaeger

```bash
docker-compose up -d jaeger
```
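
If you prefer not to use the compose file, Jaeger's all-in-one image gives the same result (16686 is the UI port used later; 6831/udp is Jaeger's default agent port):

```bash
# Alternative to docker-compose: run Jaeger all-in-one directly
docker run -d --name jaeger \
  -p 16686:16686 \
  -p 6831:6831/udp \
  jaegertracing/all-in-one:latest
```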

3. Run the 3 microservices

The execute_all_apis.py file is the key to the project:

```python
# What execute_all_apis.py does:
import multiprocessing

import uvicorn


def run_api_1():
    # Starts api_average on localhost:9001
    uvicorn.run("api_average:app", host="localhost", port=9001)


def run_api_2():
    # Starts api_add on localhost:9002
    uvicorn.run("api_add:app", host="localhost", port=9002)


def run_api_3():
    # Starts api_divide on localhost:9003
    uvicorn.run("api_divide:app", host="localhost", port=9003)


if __name__ == "__main__":  # guard required by multiprocessing on spawn-based platforms
    # Creates 3 parallel processes, one for each API
    processes = [
        multiprocessing.Process(target=run_api_1),
        multiprocessing.Process(target=run_api_2),
        multiprocessing.Process(target=run_api_3),
    ]

    # Starts them all at the same time
    for process in processes:
        process.start()
```

Execute:

```bash
python execute_all_apis.py
```

You'll see something like:

```
Starting API configuration
Starting average API at localhost:9001
Starting add API at localhost:9002
Starting divide API at localhost:9003
```

4. Make a test request

```bash
curl -X POST "http://localhost:9001/average" \
     -H "Content-Type: application/json" \
     -d '{"numbers": [10, 20, 30, 40, 50]}'
```
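
If everything is wired up correctly, the response should be:

```json
{
  "result": 30.0
}
```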

👀 What you'll see in Jaeger

Access Jaeger UI

Open: http://localhost:16686/

Search for your trace

  1. In the Service dropdown, select api_average
  2. Click Find Traces
  3. You'll see your request listed

What the complete trace shows:

```
🔍 Trace: calculation-average-[timestamp]
├── 📊 api_average: POST /average (80ms total)
│   ├── 🔗 HTTP Request: POST /add → api_add (30ms)
│   │   └── 📊 api_add: POST /add (25ms)
│   │       └── ✅ Sum calculated: 150
│   └── 🔗 HTTP Request: POST /divide → api_divide (20ms)
│       └── 📊 api_divide: POST /divide (15ms)
│           └── ✅ Division calculated: 30.0
└── ✅ Final response: 30.0
```

Specific details you'll see:

🎯 In Timeline view:

  • Span 1: api_average POST /average (parent span)
    • Span 2: HTTP POST http://localhost:9002/add (child)
      • Span 3: api_add POST /add (grandchild)
    • Span 4: HTTP POST http://localhost:9003/divide (child)
      • Span 5: api_divide POST /divide (grandchild)

📋 In Tags you'll see:

  • service.name: api_average, api_add, api_divide
  • http.method: POST
  • http.url: http://localhost:9002/add, etc.
  • http.status_code: 200
  • cluster: cluster_1, cluster_2, cluster_3
  • datacentre: datacentre_1

⏱️ In Timings:

  • How long each service took to respond
  • Total time the entire process took
  • Where potential bottlenecks are

Service Map View

Jaeger will also show you a visual map:

```
[Client] → [api_average] → [api_add]
              ↓
         [api_divide]
```

🧪 Experiment with the project

Try with different numbers:

```bash
# Small numbers
curl -X POST "http://localhost:9001/average" \
     -H "Content-Type: application/json" \
     -d '{"numbers": [1, 2, 3]}'

# Many numbers
curl -X POST "http://localhost:9001/average" \
     -H "Content-Type: application/json" \
     -d '{"numbers": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}'
```

Each request will generate a new trace in Jaeger that you can analyze.
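
You can also exercise Jaeger's error tracking with a failure case. For example, an empty list should force a division by zero downstream (assuming the services do no input validation), and the failing span will show up on the trace:

```bash
# Hypothetical failure case: an empty list means api_divide is asked
# to divide by zero (assuming no input validation in the services)
curl -X POST "http://localhost:9001/average" \
     -H "Content-Type: application/json" \
     -d '{"numbers": []}'
```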

📊 Project-specific configuration

Each service has its own configuration in the initialize_telemetry() function:

api_average:

```python
service_name="api_average"
cluster="cluster_1"
datacentre="datacentre_1"
```

api_add:

```python
service_name="api_add"
cluster="cluster_2"
datacentre="datacentre_1"
```

api_divide:

```python
service_name="api_divide"
cluster="cluster_3"
datacentre="datacentre_1"
```

This allows you to see in Jaeger how each service is identified and distributed.
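
For reference, here is a minimal sketch of what such an initialize_telemetry() function could look like with the standard OpenTelemetry Python SDK. The Jaeger thrift exporter and the exact parameter names are assumptions for illustration, not the repo's verified code:

```python
# Sketch of initialize_telemetry() (illustrative, assuming the standard
# OpenTelemetry SDK plus the Jaeger thrift exporter)
from opentelemetry import trace
from opentelemetry.exporter.jaeger.thrift import JaegerExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor


def initialize_telemetry(service_name: str, cluster: str, datacentre: str) -> None:
    # These resource attributes become the tags listed above
    # (service.name, cluster, datacentre) on every span
    resource = Resource.create({
        "service.name": service_name,
        "cluster": cluster,
        "datacentre": datacentre,
    })
    provider = TracerProvider(resource=resource)
    # Batch finished spans and ship them to the local Jaeger agent
    provider.add_span_processor(BatchSpanProcessor(
        JaegerExporter(agent_host_name="localhost", agent_port=6831)
    ))
    trace.set_tracer_provider(provider)
```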

🔧 Specific troubleshooting

If you don't see traces in Jaeger:

  1. Verify that the 3 services are running: `ps aux | grep python`
  2. Make a request: `curl -X POST "http://localhost:9001/average" -H "Content-Type: application/json" -d '{"numbers": [1,2,3]}'`
  3. Wait 10-15 seconds and search in Jaeger

If api_average gives error 500:

  • Verify that api_add responds: `curl -X POST "http://localhost:9002/add" -H "Content-Type: application/json" -d '{"numbers": [1,2,3]}'`
  • Verify that api_divide responds: `curl -X POST "http://localhost:9003/divide" -H "Content-Type: application/json" -d '{"divide": 6, "divindend": 3}'`
