# Normalization

In the previous experiment we used two axis systems:

- One relative to the APs, which was 4x4
- One relative to the picos, which was 10x10

To express our points in a common relative frame, our first step is to create two functions:
- One for the AP positions, which maps the 4x4 grid to normalized values
- One for the pico positions, which maps the 10x10 grid to normalized values

In our case a normalized value is a value between -1 and 1. The origin of our axis system will be one of the APs.

The maximum and minimum of the normalization depend on the spacing between samples. We therefore take an axis that goes from 0 to 1 and subdivide it into 10 segments, given that we have 10 pico samples. The size of one segment is our unit in the final normalized space (in both length and width), which gives us 100 areas of 1 segment in length and width.

For each triangle configuration we then move the origin to one of its vertices and map out the data points according to where they land relative to that origin, using the coordinate system built from the segments/areas. These are our normalized points.
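
As a worked illustration of the mapping described above, the formulas below use notation introduced here only for this example: $(x_p, y_p)$ is a pico position on the 10x10 grid and $(x_o, y_o)$ is the origin on the 4x4 AP grid. The sample point is hypothetical.

$$
x_n = \frac{x_p}{10} - \frac{x_o}{4}, \qquad y_n = \frac{y_p}{10} - \frac{y_o}{4}
$$

For instance, a pico at $(4, 8)$ with the origin placed on an AP at $(1, 1)$ would give $x_n = 0.4 - 0.25 = 0.15$ and $y_n = 0.8 - 0.25 = 0.55$.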

# Collect data

In [69]:
from pymongo import MongoClient
from IPython.display import display, Markdown
from datetime import datetime

client = MongoClient("mongodb://localhost:28910/")
db = client["wifi_data_db"]
collection = db["wifi_client_data"]


AP_BSSID = {
    "ec:01:d5:2b:5f:e0": "Freind1",
    "ec:01:d5:27:1d:00": "Freind2",
    "ec:01:d5:28:fa:c0": "Freind3",
}

In [70]:

def normalize_picos_coordinates(x, y, origin_x, origin_y):
    """Normalize a pico position (10x10 grid) relative to an origin given in AP grid units (4x4)."""
    # Normalized interval sizes
    pico_interval = 1 / 10
    ap_interval = 1 / 4

    # Normalized locations
    normalized_x = pico_interval * x
    normalized_y = pico_interval * y
    normalized_origin_x = ap_interval * origin_x
    normalized_origin_y = ap_interval * origin_y

    # Shift the pico position so that the origin sits at (0, 0)
    return (normalized_x - normalized_origin_x, normalized_y - normalized_origin_y)


def calculate_centroid(point1, point2, point3):
    """Return the centroid (mean of the three vertices) of a triangle."""
    cx = (point1[0] + point2[0] + point3[0]) / 3
    cy = (point1[1] + point2[1] + point3[1]) / 3
    return (cx, cy)
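
As a quick sanity check of the two helpers, the sketch below re-declares them in compact form so that it runs on its own. It normalizes the raw pico location (4, 8) against the centroid of the triangle with AP-grid vertices (1, 1), (3, 1) and (1, 3). The choice of sample point is only for illustration.

```python
def normalize_picos_coordinates(x, y, origin_x, origin_y):
    # Same arithmetic as the helper above: picos on a 10x10 grid,
    # origin expressed on the 4x4 AP grid.
    return (x / 10 - origin_x / 4, y / 10 - origin_y / 4)


def calculate_centroid(point1, point2, point3):
    cx = (point1[0] + point2[0] + point3[0]) / 3
    cy = (point1[1] + point2[1] + point3[1]) / 3
    return (cx, cy)


# Centroid of the triangle, in AP grid units
origin = calculate_centroid((1, 1), (3, 1), (1, 3))

# Normalize one raw pico location against that origin
norm_x, norm_y = normalize_picos_coordinates(4, 8, origin[0], origin[1])
print(f"{norm_x:.4f},{norm_y:.4f}")
```

The printed pair should agree with the normalized location that the notebook output reports for raw location (4, 8) in the same triangle configuration.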

In [71]:
from pymongo import MongoClient
from datetime import datetime

def transform_wifi_data(db, origin_x=None, origin_y=None, start_time=None, end_time=None, dry_run=False):
    """
    Transform wifi scan data into filtered format with normalized coordinates.
    
    Args:
        db: MongoDB database object
        origin_x: Origin x-coordinate for normalization
        origin_y: Origin y-coordinate for normalization
        start_time: datetime object for start of time range (inclusive)
        end_time: datetime object for end of time range (inclusive)
        dry_run: If True, only preview changes without writing to DB
    """
    # Define the BSSID to AP mapping
    ap_mapping = {
        "ec:01:d5:2b:5f:e0": "AP1_rssi",
        "ec:01:d5:27:1d:00": "AP2_rssi",
        "ec:01:d5:28:fa:c0": "AP3_rssi"
    }
    
    # Custom mapping of IP endings to y positions
    ip_to_y = {
        31: 1, 32: 2, 33: 3, 34: 4, 35: 5,
        36: 6, 37: 7, 38: 8, 39: 9, 30: 10
    }
    
    # Convert datetime objects to Unix timestamps if provided
    match_stage = {}
    if start_time:
        match_stage["timestamp"] = {"$gte": start_time.timestamp()}
    if end_time:
        match_stage.setdefault("timestamp", {})["$lte"] = end_time.timestamp()
    
    pipeline = [
        # Time filtering if dates provided (an empty filter matches every document)
        {"$match": match_stage},
        # Transform each document
        {
            "$addFields": {
                "ip_ending": {
                    "$toInt": {"$arrayElemAt": [{"$split": ["$metadata.pico_ip", "."]}, 3]}
                }
            }
        },
        {
            "$project": {
                "_id": 0,
                "raw_location_x": "$metadata.button_id",
                "raw_location_y": {
                    "$switch": {
                        # One branch per pico, generated from ip_to_y
                        "branches": [
                            {"case": {"$eq": ["$ip_ending", ip]}, "then": y}
                            for ip, y in ip_to_y.items()
                        ],
                        "default": None
                    }
                },
                "data": 1,
                "timestamp": 1
            }
        },
        {"$match": {"raw_location_y": {"$ne": None}}},  # Filter out invalid IPs
        # Unwind the data array
        {"$unwind": "$data"},
        # Filter only the APs we're interested in
        {"$match": {"data.BSSID": {"$in": list(ap_mapping.keys())}}},
        # Group by original document and create AP fields
        {
            "$group": {
                "_id": {
                    "raw_location_x": "$raw_location_x",
                    "raw_location_y": "$raw_location_y",
                    "timestamp": "$timestamp"
                },
                **{
                    field_name: {
                        "$max": {
                            "$cond": [
                                {"$eq": ["$data.BSSID", bssid]},
                                "$data.RSSI",
                                None
                            ]
                        }
                    }
                    for bssid, field_name in ap_mapping.items()
                }
            }
        }
    ]
    
    # Execute the aggregation on the raw scan collection of the given db
    results = list(db["wifi_client_data"].aggregate(pipeline))
    
    # Apply normalization to each document
    normalized_results = []
    for doc in results:
        raw_x = doc["_id"]["raw_location_x"]
        raw_y = doc["_id"]["raw_location_y"]
        
        # Apply normalization (origin defaults to 0 if not provided)
        norm_x, norm_y = normalize_picos_coordinates(
            raw_x, raw_y,
            origin_x if origin_x is not None else 0,
            origin_y if origin_y is not None else 0
        )
        
        # Create new document with normalized coordinates
        if dry_run:
            new_doc = {
                "raw_location_x": raw_x,
                "raw_location_y": raw_y,
                "location_x": norm_x,
                "location_y": norm_y,
                "timestamp": doc["_id"]["timestamp"],
                **{field: doc.get(field) for field in ap_mapping.values()}
            }
        else:
            new_doc = {
                "location_x": norm_x,
                "location_y": norm_y,
                "timestamp": doc["_id"]["timestamp"],
                **{field: doc.get(field) for field in ap_mapping.values()}
            }
        normalized_results.append(new_doc)
    
    if dry_run:
        print(f"Dry run: Would process {len(normalized_results)} documents")
        if normalized_results:
            print("Sample documents (with normalized coordinates):")
            for i, doc in enumerate(normalized_results[:min(5, len(normalized_results))]):
                print(f"Document {i+1}:")
                print(f"  Raw location: {doc['raw_location_x']:.4f},{doc['raw_location_y']:.4f}")
                print(f"  Normalized location: {doc['location_x']:.4f},{doc['location_y']:.4f}")
                print(f"  timestamp: {datetime.fromtimestamp(doc['timestamp'])}")
                for ap in ap_mapping.values():
                    print(f"  {ap}: {doc.get(ap, 'N/A')}")
                print()
        return normalized_results
    
    # Create or update the filtered collection
    if normalized_results:
        db.wifi_data_filtered.delete_many({})  # Clear existing data
        db.wifi_data_filtered.insert_many(normalized_results)
        print(f"Successfully processed {len(normalized_results)} documents into wifi_data_filtered")
        return normalized_results
    else:
        print("No documents matched the criteria")
        return []
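
To make the `$addFields` / `$project` / `$unwind` / `$group` stages easier to follow, here is a plain-Python sketch of the same flattening applied to a single scan document, with no database involved. The sample document, its IP address and the extra BSSID are hypothetical; only the field names and the two mappings are taken from the function above. Unlike the pipeline, which groups by location and timestamp, this sketch handles one document at a time.

```python
# BSSID -> output field, as in transform_wifi_data
AP_MAPPING = {
    "ec:01:d5:2b:5f:e0": "AP1_rssi",
    "ec:01:d5:27:1d:00": "AP2_rssi",
    "ec:01:d5:28:fa:c0": "AP3_rssi",
}

# Last octet of the pico IP -> y position, as in transform_wifi_data
IP_TO_Y = {31: 1, 32: 2, 33: 3, 34: 4, 35: 5, 36: 6, 37: 7, 38: 8, 39: 9, 30: 10}

# Hypothetical scan document shaped like the fields the pipeline reads
scan = {
    "metadata": {"pico_ip": "192.168.1.37", "button_id": 4},
    "timestamp": 1747170000.0,
    "data": [
        {"BSSID": "ec:01:d5:2b:5f:e0", "RSSI": -62},
        {"BSSID": "ec:01:d5:28:fa:c0", "RSSI": -47},
        {"BSSID": "aa:bb:cc:dd:ee:ff", "RSSI": -80},  # untracked AP, dropped
    ],
}


def flatten_scan(scan):
    """Flatten one scan into a row with one RSSI column per tracked AP."""
    ip_ending = int(scan["metadata"]["pico_ip"].split(".")[3])
    raw_y = IP_TO_Y.get(ip_ending)
    if raw_y is None:
        return None  # unknown pico, filtered out like the second $match stage

    row = {
        "raw_location_x": scan["metadata"]["button_id"],
        "raw_location_y": raw_y,
        "timestamp": scan["timestamp"],
    }
    for bssid, field in AP_MAPPING.items():
        readings = [e["RSSI"] for e in scan["data"] if e["BSSID"] == bssid]
        # Keep the strongest reading, or None when the AP was not seen
        row[field] = max(readings) if readings else None
    return row


row = flatten_scan(scan)
print(row)
```

An AP that does not appear in a scan ends up as `None` in the row, which is why some RSSI fields in the results are empty.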


In [None]:

# Connect to MongoDB
client = MongoClient('mongodb://localhost:27017/')

# Time window and origin (triangle centroid, in AP grid units) for each triangle configuration
triangle_dictionary = {
    "reto_grande": {
        "start": datetime(2025, 5, 13, 20, 10),
        "end": datetime(2025, 5, 13, 21, 42),
        "origin": calculate_centroid((0, 0), (4, 0), (0, 4))
    },
    "reto_medio": {
        "start": datetime(2025, 5, 13, 21, 46),
        "end": datetime(2025, 5, 13, 22, 49),
        "origin": calculate_centroid((1, 1), (3, 1), (1, 3))
    },
    "reto_pequeno": {
        "start": datetime(2025, 5, 13, 22, 51),
        "end": datetime(2025, 5, 13, 22, 53),
        "origin": calculate_centroid((1, 1), (2, 1), (1, 2))
    },
}


current_triangle = triangle_dictionary["reto_medio"]
start_time  = current_triangle["start"]
end_time    = current_triangle["end"]
origin      = current_triangle["origin"]
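print("\nDate filtered example:")
# dry_run=False clears and rewrites wifi_data_filtered; the output shown below comes from a dry_run=True preview
transform_wifi_data(db, origin[0], origin[1], start_time, end_time, dry_run=False)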


Date filtered example:
Dry run: Would process 21140 documents
Sample documents (with normalized coordinates):
Document 1:
  Raw location: 4.0000,8.0000
  Normalized location: -0.0167,0.3833
  timestamp: 2025-05-13 22:29:52.022088
  AP1_rssi: -62
  AP2_rssi: None
  AP3_rssi: -47

Document 2:
  Raw location: 9.0000,1.0000
  Normalized location: 0.4833,-0.3167
  timestamp: 2025-05-13 21:53:10.367172
  AP1_rssi: -67
  AP2_rssi: -54
  AP3_rssi: -65

Document 3:
  Raw location: 7.0000,2.0000
  Normalized location: 0.2833,-0.2167
  timestamp: 2025-05-13 22:15:16.852819
  AP1_rssi: -67
  AP2_rssi: -59
  AP3_rssi: None

Document 4:
  Raw location: 6.0000,4.0000
  Normalized location: 0.1833,-0.0167
  timestamp: 2025-05-13 22:18:39.673155
  AP1_rssi: -59
  AP2_rssi: -56
  AP3_rssi: None

Document 5:
  Raw location: 6.0000,4.0000
  Normalized location: 0.1833,-0.0167
  timestamp: 2025-05-13 22:18:44.071469
  AP1_rssi: -58
  AP2_rssi: -56
  AP3_rssi: -68



[{'raw_location_x': 4,
  'raw_location_y': 8,
  'location_x': -0.016666666666666663,
  'location_y': 0.38333333333333336,
  'timestamp': 1747171792.022088,
  'AP1_rssi': -62,
  'AP2_rssi': None,
  'AP3_rssi': -47},
 {'raw_location_x': 9,
  'raw_location_y': 1,
  'location_x': 0.48333333333333334,
  'location_y': -0.31666666666666665,
  'timestamp': 1747169590.3671715,
  'AP1_rssi': -67,
  'AP2_rssi': -54,
  'AP3_rssi': -65},
 {'raw_location_x': 7,
  'raw_location_y': 2,
  'location_x': 0.2833333333333334,
  'location_y': -0.21666666666666667,
  'timestamp': 1747170916.852819,
  'AP1_rssi': -67,
  'AP2_rssi': -59,
  'AP3_rssi': None},
 {'raw_location_x': 6,
  'raw_location_y': 4,
  'location_x': 0.1833333333333334,
  'location_y': -0.016666666666666663,
  'timestamp': 1747171119.673155,
  'AP1_rssi': -59,
  'AP2_rssi': -56,
  'AP3_rssi': None},
 {'raw_location_x': 6,
  'raw_location_y': 4,
  'location_x': 0.1833333333333334,
  'location_y': -0.016666666666666663,
  'timestamp': 17471711