In [1]:
import numpy as np
import csv
from os import path
import random
import folium

# Privacy Evaluation:

## Assumptions and adversarial models

Assumptions and adversarial models are essential in privacy analysis as they enable us to identify vulnerabilities, evaluate privacy guarantees, enhance threat modeling, and guide system design. By considering a wide range of potential threats and adversaries, we can develop robust privacy solutions that protect individuals' personal information and maintain the confidentiality, integrity, and availability of sensitive data.

### Assumptions

The first and most basic assumption of our privacy evaluation is that each IP address in the data set corresponds to a unique user, and that users do not hide their IP address or use additional privacy techniques. This assumption is important because we need to protect the privacy of all users, including those who use no additional protection.

.. other assumptions

### Adversarial models

An adversary could aim to determine users' geographical locations from the locations of their requests. Using the timestamp and location of each request, they could approximate where a user lives or works.


... Adversarial models


## Attack strategy

There are two main types of attack:

1. Location-based Attack: This type of attack aims to determine the geographical position of a user based on the latitude and longitude coordinates associated with the requests. The strategies for this attack include:

- Extracting the query locations of each user, grouped by IP address, which is assumed to be unique to each user.
- Using an algorithm to approximate the user's geographical locations from the locations and timestamps of their queries.
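
The two steps above can be sketched as follows. This is a minimal illustration, not the method used later in this notebook. It assumes each query has already been parsed into a hypothetical `(latitude, longitude, hour)` tuple, that night-time queries (22:00 to 06:00) come from home and office-hours queries (09:00 to 17:00) come from work, and that rounding coordinates to three decimals (roughly 100 m) is enough to merge nearby points. The names `most_common_cell` and `guess_home_and_work`, the time windows, and the sample data are all assumptions.

```python
from collections import Counter

def most_common_cell(points, decimals=3):
    """Return the most frequent rounded (lat, lon) cell, or None if empty."""
    cells = Counter(
        (round(lat, decimals), round(lon, decimals)) for lat, lon in points
    )
    return cells.most_common(1)[0][0] if cells else None

def guess_home_and_work(queries):
    """queries: iterable of (latitude, longitude, hour), hour in 0..23."""
    # Assumption: queries made at night are sent from home
    night = [(lat, lon) for lat, lon, hour in queries if hour >= 22 or hour < 6]
    # Assumption: queries made during office hours are sent from work
    office = [(lat, lon) for lat, lon, hour in queries if 9 <= hour < 17]
    return most_common_cell(night), most_common_cell(office)

# Hypothetical queries attributed to a single IP address
sample = [
    (46.5201, 6.6301, 23),
    (46.5202, 6.6302, 1),
    (46.5224, 6.5662, 10),
    (46.5223, 6.5661, 14),
    (46.5410, 6.6010, 19),
]
home, work = guess_home_and_work(sample)
print("Likely home:", home)
print("Likely work:", work)
```

The time windows are deliberately crude; a real adversary would likely tune them and also account for weekdays versus weekends.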


2. Interest-based Attack: This attack focuses on inferring a user's interests by analyzing the types of Point of Interest (POI) filtered by the user in their queries. The strategies for this attack include:

- Analyzing the requested POI types to infer the user's personal interests. For example, if a user consistently queries for restaurants, it can be assumed that they are interested in gastronomy.
- Examining repeated queries for a specific type of POI to deduce user preferences or regular habits. For instance, if a user repeatedly queries for the same type of place, such as restaurants, near the same location, this suggests a preference or routine.
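
A minimal sketch of the interest-based strategies above, assuming the queries have been reduced to hypothetical `(ip, poi_type)` pairs. The function name `poi_profile`, the IP addresses, and the POI type labels are illustrative assumptions, not values taken from the data set.

```python
from collections import Counter, defaultdict

def poi_profile(queries):
    """queries: iterable of (ip, poi_type). Returns {ip: Counter of POI types}."""
    profile = defaultdict(Counter)
    for ip, poi_type in queries:
        profile[ip][poi_type] += 1
    return profile

# Hypothetical queries; the IPs and POI types are made up for illustration
sample = [
    ("10.0.0.1", "restaurant"),
    ("10.0.0.1", "restaurant"),
    ("10.0.0.1", "gym"),
    ("10.0.0.2", "bar"),
]
profile = poi_profile(sample)
for ip, counts in profile.items():
    top_type, hits = counts.most_common(1)[0]
    total = sum(counts.values())
    print(ip, "queries most often for", top_type, f"({hits}/{total})")
```

The share of queries devoted to the top POI type gives a rough confidence signal: a type that dominates a user's queries is a stronger hint of interest than one that appears once.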


## Demonstration of the attacks

### Location-based Attack

Let's start the attack by extracting the information it needs and grouping the data by IP address:

In [2]:
# Extract the IP address and query location of each request

with open("queries.csv", 'r') as fichier_csv:
    lecteur_csv = csv.reader(fichier_csv, delimiter=' ')
    
    # Skip the first line (assumed to be the header row)
    prochaine_ligne = next(lecteur_csv)
    
    user_data = []
    # Keep the first three columns of each row: IP, latitude, longitude
    for line in lecteur_csv:
        user_data.append([line[0], line[1], line[2]])
        
# Group the data by IP
ip_map = {}
for entry in user_data:
    ip = entry[0]
    coordinates = entry[1:]
    
    if ip in ip_map:
        ip_map[ip].append(coordinates)
    else:
        ip_map[ip] = [coordinates]

Now that the extraction is done, let's look at the data of one particular user for the demonstration. The chosen user is the one that corresponds to the IP address "34.101.177.245".

In [3]:
# Get the data of one user :
data = ip_map['34.101.177.245']
latitude_sum = 0.0
longitude_sum = 0.0
count = 0

# Compute the means
for coordinates in data:
    latitude_sum += float(coordinates[0])
    longitude_sum += float(coordinates[1])
    count += 1
latitude_mean = latitude_sum / count
longitude_mean = longitude_sum / count

# Create a map with the locations of the requests of the particular user
carte = folium.Map(location=[latitude_mean, longitude_mean], zoom_start=12)

for coordinates in data:
    latitude = float(coordinates[0])
    longitude = float(coordinates[1])
    
    # Add a small random offset so that markers at the same place remain distinguishable
    delta1 = random.uniform(-0.0003, 0.0003)
    delta2 = random.uniform(-0.0003, 0.0003)
    new_latitude = latitude + delta1
    new_longitude = longitude + delta2
    
    folium.Marker(
        location=[new_latitude, new_longitude],
        popup="Latitude: {}<br>Longitude: {}".format(latitude, longitude)
    ).add_to(carte)
    
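# Mark the mean position; the popup shows the mean coordinates, not the last point of the loop
folium.Marker([latitude_mean, longitude_mean], popup="Mean:<br>- Latitude: {}<br>- Longitude: {}".format(latitude_mean, longitude_mean)).add_to(carte)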

# Save the map
carte.save('carte.html')

print("Mean coordinates:")
print("Latitude : ", latitude_mean)
print("Longitude : ", longitude_mean)

Mean coordinates:
Latitude :  46.54256554474061
Longitude :  6.5999208195069015


The map now shows every location from which the user made a query. Three main locations stand out, where several points cluster; these probably correspond to where the user lives and works. We can also suppose that the user has some interest in the University of Lausanne.

![Map of the user's query locations](map_user.png)

This quick, basic analysis only averages and plots the coordinates. It could be taken much further, in particular by using the timestamps to obtain a more precise analysis.
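
The clusters spotted visually on the map can also be ranked programmatically. The sketch below is one possible approach rather than part of the original analysis: it rounds each coordinate to three decimals (roughly 100 m, an assumed resolution) and counts how many queries fall in each resulting grid cell. The function name `frequent_locations`, the `min_count` threshold, and the sample coordinates are hypothetical; the input format mirrors the lists of string pairs stored in `ip_map`.

```python
from collections import Counter

def frequent_locations(coordinates, decimals=3, min_count=2):
    """Group (lat, lon) pairs into rounded grid cells and keep the busy ones."""
    cells = Counter(
        (round(float(lat), decimals), round(float(lon), decimals))
        for lat, lon in coordinates
    )
    # most_common() sorts cells from the most to the least frequent
    return [(cell, n) for cell, n in cells.most_common() if n >= min_count]

# Hypothetical coordinates, stored as strings like the entries of `ip_map`
sample = [
    ["46.5201", "6.6301"], ["46.5202", "6.6302"], ["46.5203", "6.6299"],
    ["46.5224", "6.5662"], ["46.5223", "6.5661"],
    ["46.5410", "6.6010"],
]
hotspots = frequent_locations(sample)
for cell, n in hotspots:
    print(cell, n)
```

Rounding to a fixed grid is simple but can split a cluster that straddles a cell boundary; a density-based clustering method would be more robust at the cost of extra complexity.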

### Interest-based Attack
...