# Access4All — Final Interface (Top-K Listing View)

This notebook implements the final, deterministic interface for Access4All.

Users select a **city** and **Top-K**, and listings are ranked by:
1. Final accessibility score
2. Confidence
3. Property ID (tie-breaker)
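
The three-key ordering above can be illustrated without Spark. The following is a minimal sketch in plain Python, assuming hypothetical sample rows and a hypothetical helper named `rank_top_k`; all three keys are sorted in descending order, which mirrors the `orderBy` used later in this notebook.

```python
# Hypothetical sample rows: (property_id, final_score, final_confidence).
listings = [
    (101, 0.80, 0.90),
    (102, 0.80, 0.95),
    (103, 0.92, 0.40),
    (104, 0.80, 0.95),
]


def rank_top_k(rows, k):
    """Sort by score, then confidence, then property ID (all descending); keep k rows."""
    ordered = sorted(rows, key=lambda r: (r[1], r[2], r[0]), reverse=True)
    return ordered[:k]


top_3 = rank_top_k(listings, 3)
top_3_ids = [row[0] for row in top_3]
```

Because the property ID is unique per listing, the last key always breaks ties, so the same inputs give the same order on every run.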

No scoring is performed here: this notebook only filters by the selected city and
presents results from the precomputed `access4all_final_scores` table.

### How to run

1. Insert your Azure SAS token in the **Data Loading** cell (used for data access only)
2. Run the notebook top to bottom
3. Select city and Top-K when prompted

To re-run with different widget selections, re-run the cells starting from the **Input Handling** cell.

Loaded tables:
- `access4all_final_scores`
- `access4all_airbnb_v1_9cities` (UI enrichment only)

All tables are loaded from a shared Azure Blob Storage container (main project storage).



In [0]:
from pyspark.sql.functions import split, col, round

### Data Loading (Shared Storage)

The interface loads data directly from shared Azure Blob Storage using SAS authentication.  
Loaded files: `access4all_final_scores.csv`, `access4all_airbnb_v1_9cities.csv`.
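
As a rough illustration of how the storage paths and the SAS token are prepared, here is a minimal sketch in plain Python. The helper names `build_abfss_path` and `normalize_sas_token` and the example values are hypothetical and are not part of the notebook; the path layout follows the `{group}/tables/` convention used in the loading cell.

```python
def build_abfss_path(container, storage_account, group, filename):
    """Build an abfss:// URI for a file under the group's tables/ folder."""
    return (
        f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
        f"{group}/tables/{filename}"
    )


def normalize_sas_token(token):
    """Drop a leading '?' so the token can be passed to the Spark config directly."""
    return token.lstrip("?")


# Hypothetical example values, for illustration only.
example_path = build_abfss_path(
    "examplecontainer", "exampleaccount", "example_group", "scores.csv"
)
example_token = normalize_sas_token("?sv=example")
```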


In [0]:
storage_account = "lab94290"
container = "submissions"
group = "merian_amit_ward"

sas_token = "Insert Your SAS Token Here"
sas_token = sas_token.lstrip("?")

spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "SAS")
spark.conf.set(
    f"fs.azure.sas.token.provider.type.{storage_account}.dfs.core.windows.net",
    "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider"
)
spark.conf.set(
    f"fs.azure.sas.fixed.token.{storage_account}.dfs.core.windows.net",
    sas_token
)

# --- final scores (base table) ---
final_scores_path = (
    f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
    f"{group}/tables/access4all_final_scores.csv"
)

df_final_scores = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv(final_scores_path)
)

# --- v1 cities (for late join only) ---
v1_cities_path = (
    f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
    f"{group}/tables/access4all_airbnb_v1_9cities.csv"
)

df_v1_cities = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv(v1_cities_path)
    # normalize URL (remove query params)
    .withColumn("url", split(col("url"), "\\?")[0])
    # deduplicate by logical identity
    .dropDuplicates(["property_id", "url"])
)

# --- keep one score row per listing ---
df_final_scores = (
    df_final_scores
    .dropDuplicates(["property_id"])
)



### User Inputs

The following controls let users choose a city and how many top-ranked results (`Top K`) to display.


In [0]:
# =========================
# Access4All – User Inputs
# =========================

dbutils.widgets.dropdown(
    "city",
    "Paris",
    ["Paris", "Rome", "Dubai", "São Paulo", "Rio de Janeiro", "Los Angeles", "New York", "Las Vegas", "San Francisco"],
    "City"
)

dbutils.widgets.text("top_k", "5", "Top K results")


### Input Handling

Widget selections are parsed and converted into typed parameters that drive filtering and ranking in the final result set.
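
Text widgets return strings, so a value such as `abc` or `0` would break or distort the later `limit` call. One possible way to guard against that is sketched below, assuming a hypothetical helper `parse_top_k` with an assumed default of 5 and an assumed upper bound of 100; neither limit comes from the notebook itself.

```python
def parse_top_k(raw_value, default=5, maximum=100):
    """Convert widget text to a positive integer, falling back to a default."""
    try:
        value = int(str(raw_value).strip())
    except ValueError:
        # Non-numeric input (e.g. "abc" or "3.5") falls back to the default.
        return default
    if value < 1:
        # Zero or negative values are not meaningful for a Top-K view.
        return default
    # Cap very large requests at the assumed maximum.
    return min(value, maximum)


parsed_examples = [parse_top_k(v) for v in ["10", " 3 ", "abc", "0", "500"]]
```

In the notebook, such a helper could wrap the value returned by `dbutils.widgets.get("top_k")` before it is used.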


**Re-run from here after changing widget values.**


In [0]:
# =========================
# Read widget values
# =========================

selected_city = dbutils.widgets.get("city")
top_k = int(dbutils.widgets.get("top_k"))

### Filtering and Presenting Final Results

In [0]:
from pyspark.sql.functions import col, round, when

df_topk = (
    df_final_scores
    .filter(col("city") == selected_city)
    .dropDuplicates(["property_id"])
    .orderBy(col("final_score").desc(), col("final_confidence").desc(), col("property_id").desc())
    .limit(top_k)
)
df_topk = (
    df_topk
    .withColumn(
        "limiting_layer_explained",
        when(col("limiting_layer") == "v1",
             "V1 – Listing attributes (e.g., elevator, step-free access). Relevant attributes may be missing or not supportive.")
        .when(col("limiting_layer") == "v2",
             "V2 – Nearby environment (OSM wheelchair POIs). Surroundings may lack supportive accessible infrastructure.")
        .when(col("limiting_layer") == "v3",
             "V3 – Terrain and slope. Steep terrain may limit practical accessibility.")
        .otherwise("No dominant limiting layer identified.")
    )
)







In [0]:


# One enrichment row per listing, so the left join cannot duplicate results.
df_v1_enrichment = (
    df_v1_cities
    .select("property_id", "url", "location", "details")
    .dropDuplicates(["property_id"])
)

df_result = (
    df_topk
    .join(df_v1_enrichment, on="property_id", how="left")
    .withColumn("access4all_score", round(col("final_score") * 100, 1))
    # Re-apply the full ranking after the join, which does not preserve row order.
    .orderBy(
        col("final_score").desc(),
        col("final_confidence").desc(),
        col("property_id").desc(),
    )
    .select(
        col("property_id").alias("Listing ID"),
        col("url").alias("Listing URL"),
        col("access4all_score").alias("Access4All Score (0–100)"),
        col("city").alias("City"),
        col("location").alias("Location"),
        col("details").alias("Listing Details"),
        col("limiting_layer_explained").alias("Primary Accessibility Limitation"),
        col("reasons_json").alias("Limiting Factors (Details)"),
        col("strengths_json").alias("Accessibility Strengths"),
    )
)

print(f"Showing top {top_k} listings in {selected_city} (ranked by accessibility score).")
print("Access4All score is shown on a 0–100 scale, where higher means more accessible.")
display(df_result)
print(
    "An Access4All score of 0 means that no reliable accessibility support could be established for the listing. "
    "This does not necessarily indicate inaccessibility, but rather insufficient or conflicting evidence. "
    "Scores of 0 may result from missing accessibility information or from violations of hard-coded accessibility constraints "
    "(e.g., lack of step-free access or elevator support). "
    "Where available, each listing includes explicit accessibility strengths and limiting reasons to provide transparency. "
    "To avoid false positive recommendations, the system takes a conservative approach and assigns a score of 0 "
    "whenever accessibility support cannot be confidently established."
)


