# Hackathon Query Notebook

Welcome to the hackathon! This notebook is your starting point for querying the main dataset.

It's designed to connect to a **remote Tentris server** that is already running and loaded with all the data. You do not need to run or install Tentris yourself.

**Your only task:** Follow the setup steps, run the cells, and start writing your own queries!

### Step 1: Set Up Your Python Environment (One-Time Setup)

Before you install any libraries, we strongly recommend creating a Python virtual environment (venv). This keeps your project's dependencies separate from your system's Python installation.

**1. Create the venv (from your terminal):**
```bash
# Make sure you are in the same directory as this notebook
python3 -m venv .venv
```

**2. Activate the venv (from your terminal):**

*On macOS / Linux:*
```bash
source .venv/bin/activate
```

*On Windows (Command Prompt):*
```bat
.venv\Scripts\activate.bat
```

*On Windows (PowerShell):*
```powershell
.venv\Scripts\Activate.ps1
```

**3. Set your VS Code kernel:**
In the top-right corner of the notebook editor in VS Code, click "Select Kernel" and choose the Python interpreter from your newly created `.venv` folder. (You may need to restart VS Code for it to appear.)

Once your venv is active (you'll see `(.venv)` at the start of your terminal prompt), proceed to the next step, which installs `ipykernel` so that the notebook can use this environment as its kernel.
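
To confirm from inside the notebook that the selected kernel really is running in the venv, a quick check along these lines can help. This is a minimal sketch using only the standard library; the helper name `in_virtualenv` is made up for illustration and is not part of the notebook.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the venv directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Running inside a virtual environment:", in_virtualenv())
```

If the second line prints `False`, the notebook is probably still using the system interpreter, and the kernel selection above is worth repeating.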

### Step 2: Install Dependencies

Run this cell once (with your venv active) to install the Python libraries needed to query the server.

In [None]:
# '%pip' installs into the environment of the running kernel
# (a plain '!pip' can end up targeting a different Python).
# 'tentris' provides the optimized client bindings used below.
%pip install rdflib pandas ipykernel tentris

Collecting tentris
  Using cached tentris-0.19.11b0-cp312-abi3-manylinux_2_34_x86_64.whl.metadata (3.5 kB)
Using cached tentris-0.19.11b0-cp312-abi3-manylinux_2_34_x86_64.whl (10.4 MB)
Installing collected packages: tentris
Successfully installed tentris-0.19.11b0


### Step 3: Imports and Configuration

This cell imports the libraries and sets up our connection variables.

In [None]:
import rdflib
import pandas as pd
import tentris # Required to register the Tentris store
from tentris import TentrisHTTPStore
from IPython.display import display, Markdown

# --- 💡 IMPORTANT 💡 ---
# This is the ONLY line you need to change.
# Set this to the server address provided by the hackathon organizers
# (a full URL, including "http://" and the port).
ENDPOINT_URL = "http://localhost:9080"
# ------------------------

# We will create our main 'graph' object in the next step
graph = None

### Step 4: Connect to the Server

This cell creates the `rdflib.Graph` object. The graph is backed by `TentrisHTTPStore`, a store that is optimized to work with the Tentris server.

**Note:** This cell also runs a test query (`ASK { ?s ?p ?o }`) to make sure the server is reachable. If this cell fails, please double-check the `ENDPOINT_URL` you set in Step 3.
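
If the connection does fail, it can help to first rule out a malformed URL or a blocked port before suspecting the server itself. The sketch below uses only the standard library; the helper names `endpoint_host_port` and `can_reach` are hypothetical, and a successful TCP connection only shows that something is listening on that port, not that Tentris is answering queries.

```python
import socket
from urllib.parse import urlsplit


def endpoint_host_port(url: str) -> tuple[str, int]:
    """Split an http(s) URL into (host, port), filling in the default port."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"Expected a URL like http://host:port, got {url!r}")
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def can_reach(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    host, port = endpoint_host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(endpoint_host_port("http://localhost:9080"))
```

Calling `can_reach(ENDPOINT_URL)` after Step 3 would then give a quick yes/no answer on basic network reachability.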

In [None]:
display(Markdown(f"🚀 Connecting to server at `{ENDPOINT_URL}`..."))

try:
    # Initialize the TentrisHTTPStore with the base endpoint URL.
    # This client is smart and knows how to find the /sparql and /stream endpoints.
    store = TentrisHTTPStore(ENDPOINT_URL)
    
    # Create the graph object
    graph = rdflib.Graph(store)
    
    # Run a simple test query to confirm the connection
    # This will raise an error if the server is unreachable
    graph.query("ASK { ?s ?p ?o }")
    
    display(Markdown("✅ **Connection successful!** The database is ready to be queried."))
    
except Exception as e:
    display(Markdown(f"❌ **Connection Failed:** Could not connect to the server.\n\n*Details: {e}*\n\nPlease check the `ENDPOINT_URL` in Step 3 and ensure the server is running."))
    graph = None  # Ensure graph is None if connection fails

🚀 Connecting to server at `http://localhost:9080`...

✅ **Connection successful!** The database is ready to be queried.

### Step 5: Run Example Queries

Now you're ready to go! The `graph` object is your gateway to the database.

Here are a few examples to get you started. You can (and should!) modify these and create new cells to write your own.

In [None]:
# Example 1: Count all triples in the database
# This is a good way to see how much data you're working with.

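# Compare with None explicitly: `if graph:` would call len(graph),
# which can trigger a triple count on the server.
if graph is not None: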
    query_str_count = "SELECT (COUNT(*) AS ?totalTriples) WHERE { ?s ?p ?o }"
    
    display(Markdown("**Running query:** Counting all triples..."))
    
    results = graph.query(query_str_count)
    
    for row in results:
        display(Markdown(f"Total Triples in Database: **{row.totalTriples}**"))
else:
    display(Markdown("⚠️ Skipping query: Database not connected."))

**Running query:** Counting all triples...

Total Triples in Database: **2000000**

In [None]:
# Example 2: Show 10 triples as a table
# We can use pandas to display the results in a nice table.

if graph is not None:  # explicit None check; avoids calling len(graph)
    query_str_limit = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"
    
    display(Markdown("**Running query:** Getting 10 triples..."))
    
    # Run the query
    results = graph.query(query_str_limit)
    
    # Convert results to a list of dictionaries
    results_list = [row.asdict() for row in results]
    
    # Load into a pandas DataFrame and display
    df = pd.DataFrame(results_list)
    display(df)
    
else:
    display(Markdown("⚠️ Skipping query: Database not connected."))

**Running query:** Getting 10 triples...

| | s | p | o |
|---|---|---|---|
| 0 | http://example.org/software/81257 | http://schema.org/contentUrl | http://example.org/downloads/software81257.zip |
| 1 | http://example.org/software/81257 | http://schema.org/datePublished | 2007-06-02 |
| 2 | http://example.org/software/81257 | http://schema.org/url | http://example.org/software/81257/ |
| 3 | http://example.org/software/81257 | http://schema.org/dateCreated | 2007-06-02 |
| 4 | http://example.org/software/81257 | http://schema.org/license | https://spdx.org/licenses/MIT.html |
| 5 | http://example.org/software/81257 | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | http://schema.org/SoftwareSourceCode |
| 6 | http://example.org/software/81257 | http://schema.org/name | Example Software 81257 |
| 7 | http://example.org/software/81257 | https://open-pulse.epfl.ch/ontology#repository... | https://open-pulse.epfl.ch/ontology#Software |
| 8 | http://example.org/software/81257 | http://schema.org/codeRepository | http://github.com/exampleorg/repo81257 |
| 9 | http://example.org/software/81257 | http://schema.org/programmingLanguage | R |


### Step 6: Your Turn! (Happy Hacking)

This is your canvas. Create new code cells below this one and start building your project.
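
As one possible starting point, here is a sketch of a slightly richer query that counts software entries per programming language. The class `schema:SoftwareSourceCode` and the predicate `schema:programmingLanguage` are taken from the sample rows in Example 2; the names `QUERY_LANGUAGES` and `run_language_counts` are made up for illustration, and the query assumes the server accepts `GROUP BY` and `ORDER BY`.

```python
# Class and predicate IRIs come from the sample rows shown in Example 2.
QUERY_LANGUAGES = """
PREFIX schema: <http://schema.org/>

SELECT ?language (COUNT(?software) AS ?softwareCount)
WHERE {
  ?software a schema:SoftwareSourceCode ;
            schema:programmingLanguage ?language .
}
GROUP BY ?language
ORDER BY DESC(?softwareCount)
LIMIT 10
"""


def run_language_counts(graph):
    """Run the query on an rdflib graph and return (language, count) pairs."""
    return [
        (str(row.language), int(row.softwareCount))
        for row in graph.query(QUERY_LANGUAGES)
    ]
```

With the `graph` object from Step 4, `run_language_counts(graph)` should return a list of pairs that can be passed straight to `pd.DataFrame(...)` for display.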

**Tip:** Don't forget to add `LIMIT 10` to your queries while you are exploring, so you don't accidentally try to print a million rows!
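
To make the tip harder to forget, a small helper could add the limit for you. This is a minimal sketch, not part of the notebook: the name `with_default_limit` is hypothetical, and it uses a simple text check rather than a SPARQL parser, so it is only meant for `SELECT` queries written the way the examples above are.

```python
import re

# Matches a LIMIT clause at the very end of the query,
# optionally followed by an OFFSET clause.
_TRAILING_LIMIT = re.compile(r"\bLIMIT\s+\d+(\s+OFFSET\s+\d+)?\s*$", re.IGNORECASE)


def with_default_limit(query: str, limit: int = 10) -> str:
    """Append `LIMIT <limit>` unless the query already ends with a LIMIT clause."""
    if _TRAILING_LIMIT.search(query):
        return query
    return f"{query.rstrip()}\nLIMIT {limit}"


print(with_default_limit("SELECT * WHERE { ?s ?p ?o }"))
```

A call such as `graph.query(with_default_limit(my_query))` then keeps exploratory results small, while a query that already ends in its own `LIMIT` is passed through unchanged.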