
[discussion] swarm id coverage of individual service node operator #479

Open
venezuela01 opened this issue Jul 24, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@venezuela01

This is not a bug; it is a reference and reminder for reasoning about the threat model of the Session network. (There is no way to submit an issue without the default bug label.)

The OPTF wallet currently controls 10% of nodes; however, it covers 131 unique swarm IDs, accounting for about 50% of all 263 swarm IDs.

The second-largest operator is L6qq, which controls 120 nodes but covers 95 unique swarm IDs, accounting for about 36% of all swarm IDs.

@venezuela01 venezuela01 added the bug Something isn't working label Jul 24, 2023
@jagerman jagerman added enhancement New feature or request and removed bug Something isn't working labels Jul 24, 2023
@venezuela01

venezuela01 commented Jul 24, 2023

The code below prints the number of nodes controlled by a wallet and the number of unique swarm IDs covered by these nodes.

import requests
from bs4 import BeautifulSoup
from concurrent.futures import ThreadPoolExecutor

def download_and_extract_swarm_id(sn_part):
    # Make a GET request to https://oxen.caliban.org/sn/<sn_part>
    response = requests.get(f"https://oxen.caliban.org/sn/{sn_part}")
    response.raise_for_status()

    # Parse the HTML of the downloaded page
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find the span tag with title "Storage server swarm"
    span_tag = soup.find('span', title='Storage server swarm')

    # Extract the swarm ID from text of the form "Swarm: <id>"
    swarm_id = span_tag.text.split(':')[-1].strip()

    return swarm_id

url = 'https://oxen.caliban.org/operator/L6qqGfKFFUAN7KBL8BzN5ThDWAxSLz7rBJsJxPGX85tjB3ihnMtzreuaSUyFfWTnTnGy553MPG6sNTMx8Yspy9CaJskD1xf'

# Fetch the HTML content
response = requests.get(url)
html_content = response.text

# Parse the HTML
soup = BeautifulSoup(html_content, 'html.parser')

# Find all <a> tags with class "no-ul"
a_tags = soup.find_all('a', class_='no-ul')

# Get the href attribute from each <a> tag
hrefs = [a.get('href') for a in a_tags]

# Filter hrefs that contain '/sn/' and get part after '/sn/'
sn_parts = [href.split('/sn/')[1] for href in hrefs if '/sn/' in href]

# Create a ThreadPoolExecutor
with ThreadPoolExecutor(max_workers=10) as executor:
    # Use the executor to map the download_and_extract_swarm_id function onto the sn_parts
    swarm_ids = executor.map(download_and_extract_swarm_id, sn_parts)

swarm_id_list = list(swarm_ids)
print("number of nodes:", len(swarm_id_list))
print("number of unique swarm ids:", len(set(swarm_id_list)))

@venezuela01

venezuela01 commented Jul 24, 2023

By the way, according to my tests, a simple simulation based on a random number generator can accurately predict the number of unique swarm IDs covered, given the number of nodes owned by an operator.

import random

# Set to hold unique numbers
unique_numbers = set()
NUM_OF_NODES = 120
NUM_OF_SWARMS = 263

# Simulate swarm assignment: each node lands in one of NUM_OF_SWARMS swarms
for _ in range(NUM_OF_NODES):
    number = random.randrange(NUM_OF_SWARMS)  # 0 .. NUM_OF_SWARMS - 1
    unique_numbers.add(number)

# Print the number of unique swarm IDs covered
print(f"The number of unique swarm IDs is: {len(unique_numbers)}")
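The simulation above also has a closed-form counterpart: with S swarms and N nodes assigned uniformly at random, the expected number of distinct swarms covered is S · (1 − (1 − 1/S)^N). A minimal check, using the figures from this thread (263 swarms, 120 nodes):

```python
S = 263  # total swarm IDs
N = 120  # nodes controlled by the operator

# Each swarm is missed by all N nodes with probability (1 - 1/S)^N,
# so the expected number of covered swarms is:
expected_unique = S * (1 - (1 - 1 / S) ** N)
print(f"expected unique swarm IDs: {expected_unique:.1f}")  # ~96.5
```

This lands at about 96.5, close to the observed 95 for the L6qq operator, which supports the uniform-random assignment model.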

@venezuela01

venezuela01 commented Jul 28, 2023

This swarm ID distribution presents an intriguing potential issue.

An attacker intending to scrape Session user IDs and messages may, in the best case, only need to control 263 carefully selected service nodes (one per swarm). This scenario becomes feasible if the attacker can strategically bribe the specific service node operators controlling nodes with the desired swarm IDs.
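For contrast, if an attacker could only acquire nodes with random swarm assignments, rather than bribing operators of specific nodes, covering every swarm becomes a coupon-collector problem: the expected number of nodes needed is S · H_S (H_S being the S-th harmonic number), roughly S · ln S. A rough sketch with the 263 swarms from this thread:

```python
S = 263  # total swarm IDs in the network

# Coupon collector's problem: the expected number of randomly assigned
# nodes needed before every one of the S swarms is covered is
# S * H_S, where H_S is the S-th harmonic number.
harmonic = sum(1 / k for k in range(1, S + 1))
expected_nodes = S * harmonic
print(f"expected nodes for full coverage: {expected_nodes:.0f}")  # ~1618
```

So random acquisition needs roughly six times as many nodes as targeted bribery, which is why the targeted variant is the interesting threat.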

Consider a hypothetical scenario: a Session competitor creates a new app, "Tession," and markets it as "Session-compatible" while secretly operating a substantial number of Session service nodes. By doing so, they could clone messages from the Session storage servers to their own Tession storage servers. Later, they could encourage Session users to restore their Session accounts on Tession, instantly giving those users access to their previous social connections from the Session network. This would be akin to entering your phone number and importing your contacts into WhatsApp, but in a more decentralized and private way. Moreover, Tession could bribe the operators of 263 Oxen service nodes, offering them Tession Network Tokens to stake on the Tession network in exchange for help scraping Session storage. This process could unfold over a long period while the competitor gradually onboards Session users to their app; in theory, a bidirectional bridge between the Tession and Session storage servers could also be built.

It's crucial to note that the network effect typically serves as a barrier preventing a later competitor from surpassing its predecessor. However, this barrier is somewhat weaker in a decentralized setting. The theoretical scraping attack described above shows how a competitor could spend a relatively limited budget to "hard fork" an existing user base, akin to how Bitcoin Cash hard forked from Bitcoin. While this doesn't imply it would be easy for a competitor to convince users to switch and gain an advantage, it does lower the barrier, and it suggests the importance of maintaining alignment between the service node operator community and the development team. If service node operators share the same interests as the development team, they're less likely to sell user data to support a competitor. Notably, such an action wouldn't directly infringe upon user privacy, although it would contradict the assumptions we implicitly make about service node operators' behavior.

@venezuela01

@KeeJef
