# Querying Data from StreamCI

## 📡 Set URL, Request Headers, and Authentication Information

To retrieve data, we send a POST request to the `/query` endpoint. The response is returned in [NDJSON](http://ndjson.org/) (Newline-Delimited JSON) format, with one JSON object per line, so we parse each line separately.

---

#### 🧠 `query(payload)` – Send a Query and Parse NDJSON Response

In [None]:
import requests
import json

URL = "https://api.streamci.org/query"
HEADERS = {"Content-Type": "application/json"}
AUTH_INFO = {
    "target": "userXX",
    "authtype": "secret",
    "secret_key": "<password>"
}

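def query(payload):
    """POST a query payload and return the NDJSON response as a list of dicts."""
    # timeout=30 is an arbitrary choice so a stalled connection cannot hang the notebook
    response = requests.post(URL, json=payload, headers=HEADERS, timeout=30)
    response.raise_for_status()  # Fail early on HTTP errors instead of parsing an error body
    # NDJSON: one JSON object per line; skip blank lines so json.loads never sees an empty string
    return [json.loads(line) for line in response.text.splitlines() if line.strip()]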
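
The NDJSON parsing step can also be separated from the HTTP call, which makes it easy to try offline. The sketch below is a minimal illustration and not part of the StreamCI API: `parse_ndjson` is a hypothetical helper name and the sample text is made up. It skips blank lines and reports the line number of any malformed record.

```python
import json

def parse_ndjson(text):
    """Parse NDJSON text into a list of Python objects (hypothetical helper)."""
    records = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # Tolerate blank lines between records
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Include the line number so a bad record is easy to locate
            raise ValueError(f"Invalid JSON on line {lineno}: {exc.msg}") from exc
    return records

# Made-up sample data for illustration only
sample = '{"deviceID": "a", "val1": 1.5}\n\n{"deviceID": "b", "val1": 2.0}\n'
records = parse_ndjson(sample)
print(len(records), "records")
```

Raising an error that names the offending line is one possible design; silently dropping bad lines would hide data loss.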

---

## 📦 Sample Query Payload – Retrieve All Sensor Data

The payload below sends a `query` request whose empty filter (`"query": {}`) matches every record on the server. The `project` list selects which fields are returned, and `sort` orders the results by `deviceID` and then by `time`.

`query()` returns the matching records as a list of dictionaries. To restrict the number of records returned, add a `"limit"` parameter to the request (it is commented out in the code below).


In [None]:
payload = {
    "auth": AUTH_INFO,
    "request": {
        "method": "query",
        "query": {},
        "project": ["time", "deviceID", "val1", "val2"],
        "sort": {"deviceID":1, "time":1},
        # "limit": 10  # Uncomment to cap the number of records returned
    }
}
response = query(payload)
print(f"Retrieved {len(response)} records")
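
If you switch between limited and unlimited queries often, the payload can be built by a small function instead of editing a comment. This is a sketch under assumptions: `build_payload` is a hypothetical helper, the credentials are placeholders, and it assumes `"limit"` belongs inside `"request"` as the commented-out line suggests.

```python
def build_payload(auth, fields, sort, limit=None):
    """Build a query payload; add "limit" only when one is given (hypothetical helper)."""
    request = {
        "method": "query",
        "query": {},  # Empty filter, as in the sample payload
        "project": list(fields),
        "sort": dict(sort),
    }
    if limit is not None:
        request["limit"] = limit
    return {"auth": auth, "request": request}

# Placeholder credentials for illustration only
auth = {"target": "userXX", "authtype": "secret", "secret_key": "<password>"}
preview = build_payload(
    auth,
    ["time", "deviceID", "val1", "val2"],
    {"deviceID": 1, "time": 1},
    limit=10,
)
print(preview["request"])
```

Passing `preview` to `query()` would then request at most 10 records, assuming the server honors `limit`.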

---

## 📊 Basic Data Analysis of Queried Results

After retrieving data with `query()`, we can run a simple analysis to understand sensor behavior.

#### 🧮 Convert to DataFrame and Compute Statistics

The code below:
- Converts the list of JSON objects into a Pandas DataFrame
- Parses the `time` column as datetime
- Groups by `deviceID` and summarizes `val1` and `val2`


For each device, `describe()` reports the **count**, **mean**, **standard deviation**, **min/max**, and **quartiles** (25%, 50%, 75%) of `val1` and `val2`.

In [None]:
import pandas as pd

# Convert to DataFrame
df = pd.DataFrame(response)

# Convert 'time' to datetime format
df["time"] = pd.to_datetime(df["time"])

# Group by deviceID and describe val1 and val2
summary = df.groupby("deviceID")[["val1", "val2"]].describe()

# Print summary
print(summary.to_string())
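
To make explicit what `describe()` reports for each group, the sketch below computes the same kinds of statistics for one field using only the standard library. The records and device names are made up for illustration. The choices of sample standard deviation and linearly interpolated quartiles are meant to mirror the pandas defaults. It is not a replacement for the pandas call above.

```python
import statistics
from collections import defaultdict

# Made-up records in the same shape as the query results
records = [
    {"deviceID": "dev1", "val1": 1.0},
    {"deviceID": "dev1", "val1": 2.0},
    {"deviceID": "dev1", "val1": 3.0},
    {"deviceID": "dev1", "val1": 4.0},
    {"deviceID": "dev2", "val1": 10.0},
    {"deviceID": "dev2", "val1": 20.0},
]

# Group val1 values by deviceID
groups = defaultdict(list)
for rec in records:
    groups[rec["deviceID"]].append(rec["val1"])

stats = {}
for device, values in groups.items():
    # method="inclusive" interpolates linearly between data points
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    stats[device] = {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # Sample standard deviation (n - 1)
        "min": min(values),
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": max(values),
    }

for device, row in stats.items():
    print(device, row)
```

Note that `statistics.stdev` needs at least two values per group, so a device with a single reading would need separate handling.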

---

## 📈 Visualize Sensor Readings Over Time

Plotting `val2` against time shows how each sensor's readings change and makes it easy to compare sensors.

#### 🖼️ Scatter Plot by Sensor

The following code:

- Loops through each unique `deviceID`
- Filters the DataFrame for that device
- Uses `matplotlib.pyplot.scatter()` to plot `val2` against time
- Applies a small dot size (`s=3`) and semi-transparency (`alpha=0.5`) for clarity


In [None]:
import matplotlib.pyplot as plt

# Plot dots for each sensor
plt.figure(figsize=(10, 5))
for sensor in df["deviceID"].unique():
    subset = df[df["deviceID"] == sensor]
    plt.scatter(subset["time"], subset["val2"], label=sensor, s=3, alpha=0.5)

plt.xlabel("Time (US/Eastern)")
plt.ylabel("val2")
plt.title("val2 Over Time by Sensor")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()