# <span style="color:blue">SAML-SSO Guide & Template</span>

This **SAML-SSO Template Guide** is provided by the <span style="color:navy">**UCSD TritonGPT Security Team**</span> to document **Danswer's OneLogin-SAML-SSO integration**. The guide is an example only, and private information has been excluded for security.

<div style="text-align: right;">
  <strong>- Edwin Ruiz</strong>
</div>

![pic](pics/security_team.png)

# <span style="color:blue">1) Cert & Key</span>
- **A. Generating Cert & Key:**
  - **i. Open terminal & enter command:**

  ```bash
  openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 -keyout privateKey.key -out certificate.crt
  ```
  - **ii. Fill out information when prompted:**
    <img src="pics/cert.png" alt="Certificate">
    
    <img src="pics/cat_cert.png" alt="Certificate">
    
    
    
- **B. Verify Certificate**
  - **i. Enter the following command on terminal:**

    ```bash
    openssl x509 -in certificate.crt -text -noout | awk '/Data:/, /Public Key Algorithm:/ {print}'
    ```

    <img src="pics/verifying_cert.png" alt="Certificate Verification">


- **C. Verify Key:**
  - **i. Enter the following command on terminal:**

    ```bash
    openssl rsa -in privateKey.key -check 2>&1 | grep -E "RSA key ok|writing RSA key"
    ```

    <img src="pics/verifying_key.png" alt="Key Verification">


- **D. Verifying Cert & Key Match:**
  - **i. Enter the following command on terminal:**

    ```bash
    [ "$(openssl x509 -noout -modulus -in certificate.crt | openssl md5)" = "$(openssl rsa -noout -modulus -in privateKey.key | openssl md5)" ] && echo "Certificate and key match" || echo "Certificate and key do not match"
    ```

    <img src="pics/verifying_cert&key_match.png" alt="Cert & Key Match Verification">
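
The match check in step D can also be wrapped in a reusable function. The following is a minimal sketch rather than part of the original repo: the function name `verify_cert_key_pair` is hypothetical, and it compares the public key embedded in the certificate with the public key derived from the private key (instead of MD5 digests of the RSA modulus), which should also work for non-RSA keys.

```bash
# Hypothetical helper (not part of the repo).
# Returns 0 if the certificate and private key share the same public key,
# 1 if they differ, and 2 if either file cannot be read by openssl.
verify_cert_key_pair() {
  local cert="$1" key="$2" cert_pub key_pub
  # Public key embedded in the certificate (PEM)
  cert_pub="$(openssl x509 -in "$cert" -noout -pubkey 2>/dev/null)" || return 2
  # Public key derived from the private key (PEM)
  key_pub="$(openssl pkey -in "$key" -pubout 2>/dev/null)" || return 2
  if [ "$cert_pub" = "$key_pub" ]; then
    echo "Certificate and key match"
  else
    echo "Certificate and key do not match"
    return 1
  fi
}
```

Usage, assuming the file names from step A: `verify_cert_key_pair certificate.crt privateKey.key`.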



# <span style="color:blue">2) Configure `sp.xml` (metadata) file</span>

Here is the provided-prefilled `sp.xml` metadata file:

![pic](pics/sp.xml_prefilled.png)



- **A. Request Stage & Prod IdP `sp.xml` file**
  - Message your I.T. Department or Identity and Access Management (IAM) team to request your institution's or organization's "IdP `sp.xml` metadata file" for both "Stage" and "PROD"

  - Academic institutions typically use "Shibboleth" as their standard SSO system

  - "Stage" IdP `sp.xml` metadata file will be used for local, dev, and stage

  - "Prod" IdP `sp.xml` metadata file will be used for PROD only
  
    <span style="color:red">Please note that your organization's `sp.xml` metadata can be provided either as a file or via an endpoint link</span>


- **B. Prefilled `sp.xml` metadata file is provided in repo**
  - Lines 1, 15, 30, 32, and 37 contain standard SAML protocol declarations and usually need no changes

  - However, "Binding" in lines 30 & 32 should be verified with the "IdP `sp.xml` metadata file" provided by the IAM team or I.T. department

  - Here is an example of what an "IdP `sp.xml` metadata file" should look like:

    <span style="color:red">Note that the following is only a sample and all sensitive information has been removed for security & privacy purposes</span>

    <img src="pics/IdP_sp.xml_metadata.png" alt="IdP metadata file">



  - Example of filling "`Binding`" in the prefilled `sp.xml` metadata file

    <img src="pics/sp.xml_post-redirect.png" alt="sp.xml post-redirect">



- **C. Enter the following command in terminal to remove header & footer from cert**

  ```bash
  sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' certificate.crt | sed '/-BEGIN CERTIFICATE-/d' | sed '/-END CERTIFICATE-/d'
  ```

  <img src="pics/rm_header_footer.png" alt="Remove header and footer">




- **D. Copy output (manually) from terminal & paste into lines 19 & 26 in `sp.xml` file provided**

  <img src="pics/line19.png" alt="Line 19">
  <img src="pics/line26.png" alt="Line 26">



- **E. Configure "entityID" in line 2 of the provided `sp.xml` file**
  - Replace "https://app-name.schoolname.edu" in line 2 with your application's domain URL
    - Example: "https://tritongpt.ucsd.edu"
    - Use the URL that matches each environment (dev, stage, and PROD)

  - For a local instance, replace "https://app-name.schoolname.edu" with "http://localhost:Port#"
    - Example: "http://localhost:8080"

  <span style="color:red">Note: dev, stage & PROD use the secure "https" protocol, while a local instance uses the insecure "http" protocol</span>
  - <span style="color:red">"http" transmits data in plain text (considered insecure) and should only be used for testing on local instances. "https" encrypts data with SSL/TLS, which protects it from interception and authenticates the server's identity.</span>



- **F. Verify SSO "`RequestedAttribute`" with I.T. Department or Identity and Access Management (IAM) team**
  - This SSO attribute is standard, but confirm with the I.T. Department or Identity and Access Management (IAM) team that your organization accepts the attribute(s) and uses "Active Directory" (`ad`). You can also add a "`RequestedAttribute`" for email as follows:

  ```xml
  <ns0:RequestedAttribute Name="urn:mace:schoolname.edu:sso:ad:email"
    NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:uri" isRequired="true" />
  ```

  <span style="color:red">Note: replace "schoolname" with the school or organization name, as confirmed by the I.T. Department or Identity and Access Management (IAM) team</span>

  Example:
<span style="color:blue">University of California San Diego</span> 
$\rightarrow$ 
<span style="color:blue">ucsd</span> 
$\rightarrow$
<span style="color:blue">urn:mace:ucsd.edu:sso:ad:username</span>


<span style="color:red">Note: this is not guaranteed to work, and the attribute should be verified with the I.T. Department or Identity and Access Management (IAM) team</span>

  <img src="pics/request_attribute.png" alt="Request Attribute">
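
The `entityID` rules from step E (https for dev, stage, and PROD; http only for localhost) can be checked before the file is deployed. The sketch below is illustrative only: both function names are hypothetical, and it assumes the metadata file carries a single `entityID="..."` attribute on one line.

```bash
# Hypothetical helpers (not part of the repo).

# Print the value of the first entityID="..." attribute found in a metadata file.
extract_entity_id() {
  sed -n 's/.*entityID="\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Accept https:// URLs for any host, and http:// only for localhost.
check_entity_id() {
  case "$1" in
    https://*)
      echo "ok: $1" ;;
    http://localhost|http://localhost:*|http://localhost/*)
      echo "ok (local only): $1" ;;
    *)
      echo "rejected: $1"
      return 1 ;;
  esac
}
```

Usage: `check_entity_id "$(extract_entity_id sp.xml)"`.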

### <span style="color:red">Algorithm & Troubleshooting Notes:</span>
  
The prefilled `sp.xml` file keeps the algorithms used in the original file to simplify client implementation. To learn how to implement them yourself, see the algorithm references that follow these notes:

- If you find yourself having to troubleshoot algorithms, ensure that both the Identity Provider (IdP) and the Service Provider (SP) support the same algorithms for signing and digest methods.

- An algorithm may also have been deprecated or removed from current standards because of security vulnerabilities or advances in cryptographic practice. In that case, remove the deprecated algorithm or replace the prefilled algorithm with a current one.

- Common digest methods to ensure that the content of the message (e.g., assertion) has not been altered during transit: SHA-256, SHA-384, and SHA-512.

- Common signing methods to create digital signatures: dsa-sha256, ecdsa-sha256, ecdsa-sha384, ecdsa-sha512, rsa-sha256, rsa-sha384, and rsa-sha512.
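
One quick way to spot deprecated algorithms is to scan the metadata file for SHA-1 or MD5 algorithm URIs. This is a rough sketch under stated assumptions: the function name `find_weak_algorithms` is hypothetical, and it assumes algorithms are declared through `Algorithm="..."` attributes whose URIs end in `sha1` or `md5` (as the W3C XML Signature URIs for those algorithms do). Confirm the acceptable set with your IAM team.

```bash
# Hypothetical helper (not part of the repo).
# Lists SHA-1 or MD5 algorithm declarations found in a metadata file.
# Returns 0 when none are found, 1 when at least one is found, 2 if unreadable.
find_weak_algorithms() {
  local file="$1" hits
  [ -r "$file" ] || { echo "cannot read: $file"; return 2; }
  # -o prints only the matching attribute; sort -u removes duplicates
  hits="$(grep -oE 'Algorithm="[^"]*(sha1|md5)"' "$file" | sort -u)" || true
  if [ -n "$hits" ]; then
    echo "$hits"
    return 1
  fi
  echo "no SHA-1 or MD5 algorithms found"
}
```

Usage: `find_weak_algorithms sp.xml`.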

#### <span style="color:red">Algorithm References:</span>

- [W3C XML Signature Syntax and Processing](https://www.w3.org/TR/xmldsig-core/#sec-Overview)

- [OASIS Security Services (SAML) TC](https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=security#overview)

- [Python SAML Documentation](https://python-saml.readthedocs.io/en/latest/)

- [hashlib module on GitHub](https://github.com/python/cpython/blob/main/Lib/hashlib.py)

- [cryptography library on GitHub](https://github.com/pyca/cryptography/blob/main/src/cryptography/hazmat/primitives/asymmetric/rsa.py) 

    <span style="color:red">Final note on algorithms: if you decide to implement algorithms from scratch, `sp.xml` metadata files should be generated using scripts</span>
    


# <span style="color:blue">3) Configure `settings.json` file</span>
 

![pics](pics/empty_settings.png)



- **A. Enter the following command into the terminal to convert the SP (app) certificate (`certificate.crt`) into a single-line string**

  ```bash
  sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' certificate.crt | sed '/-BEGIN CERTIFICATE-/d' | sed '/-END CERTIFICATE-/d' | tr -d '\n'
  ```

    <span style="color:red">Note: This certificate was generated for demonstration purposes and is not an actual SP certificate</span>

  <img src="pics/cert_string.png" alt="Convert SP Cert">



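- **B. Copy output (manually & exclude "%") from terminal & paste into line 18 in `settings.json`**
  - The trailing "%" is not part of the certificate; zsh prints it to mark output that does not end with a newline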

  <img src="pics/line18.png" alt="Line 18 in settings.json">



- **C. Copy cert from "IdP `sp.xml` metadata file" and save it in a `.txt` file named `IdP_cert.txt`**
    <span style="color:red"> Save `IdP_cert.txt` in the same directory</span>

  <img src="pics/IdP_certformate.png" alt="IdP Cert Format">
  <img src="pics/IdP_cert.png" alt="IdP Cert">



- **D. Enter the following command into the terminal to convert the IdP certificate (`IdP_cert.txt`) into a single-line string**

  ```bash
  tr -d '\n ' < IdP_cert.txt
  ```

  <img src="pics/IdP_cert_format.png" alt="Convert IdP Cert">



- **E. Copy output (manually & exclude "%") from terminal & paste into line 10 in `settings.json`**

  <img src="pics/settings_IdP_cert.png" alt="Line 10 in settings.json">



- **F. Paste the IdP `entityID` from the "IdP `sp.xml` file" into line 5 in `settings.json` & paste the SP `entityID` from the app's `sp.xml` file into line 13**

    <span style="color:red"> Note: SP `entityID` should be according to environment: local, dev, stage, or PROD</span>
    
  <img src="pics/entityID.png" alt="EntityID">



- **G. Paste the "`POST`" "`Binding`" from the "IdP `sp.xml`" metadata file into lines 8 & 16 in `settings.json` & the IdP "Redirect" "`Location`" URL into line 7 in `settings.json`**

  <img src="pics/binding_IdP.png" alt="Binding IdP">
  <img src="pics/binding.png" alt="Binding">
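
Because the certificate strings in steps A to E are copied by hand, a stray "%", space, or line break can slip into `settings.json`. The sketch below shows one way to sanity-check a string before pasting it. Both function names are hypothetical, and `pem_to_string` assumes the input file contains a single PEM certificate and nothing else.

```bash
# Hypothetical helpers (not part of the repo).

# Strip the PEM header/footer lines and all whitespace, leaving one base64 line.
# Assumes the file holds exactly one certificate.
pem_to_string() {
  sed -e '/-BEGIN CERTIFICATE-/d' -e '/-END CERTIFICATE-/d' "$1" | tr -d ' \t\r\n'
}

# Return 0 only if the argument is non-empty, single-line base64 text.
is_clean_cert_string() {
  local re='^[A-Za-z0-9+/]+={0,2}$'
  [[ "$1" =~ $re ]]
}
```

Usage: `is_clean_cert_string "$(pem_to_string certificate.crt)" && echo "safe to paste"`.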



### Notes on IdP & SP:

- **IdP (Identity Provider):** IdP is responsible for authenticating users and asserting their identities to the Service Provider (SP). It manages user credentials and provides authentication services. When a user attempts to access a service, the IdP verifies the user's identity and issues a SAML assertion (a security token) to the SP. This assertion contains information about the user's identity and attributes, which the SP uses to grant access to the requested resource.

- **SP (Service Provider):** SP relies on the IdP to authenticate users and provide identity information. It hosts the application or service that users are trying to access. Upon receiving a SAML assertion from the IdP, the SP validates the assertion, extracts the user's identity and attributes, and uses this information to authorize access to the service. The SP does not handle user authentication directly; instead, it trusts the IdP to perform this task securely.



# <span style="color:blue">4) Configure `advanced_settings.json` file</span>

- **A. Request "SSO user attributes" from the IAM team or I.T. department**

  These attributes are specified in the `requestedAuthnContext` array as **Authentication Context Class References (AuthnContextClassRefs)**. They represent the types of accounts that your SSO system recognizes and validates, so that each type of user account gets access to the application at the appropriate level of security.

  Example:
  - "Active Directory" accounts for employees or staff
  - "Business Account" for business users
  - "Student Account" for student users
  - Other types of users that should have access to your application

    <span style="color:red"> Note the actual values should be obtained from the IAM team or I.T. department to ensure accuracy and security</span>
    
  <img src="pics/user_attrib.png" alt="advanced_settings.json file Example">



- **B. Fill out Technical, Support, and Organization Information**

  - Technical field should contain the IAM team or I.T. department contact information
  - Support field should contain the contact information of the developer or apps security team implementing SAML-SSO into the application

  Example:

  <img src="pics/support_fields.png" alt="Technical, Support, and Organization Information">



# <span style="color:blue">5) `settings.json` & `advanced_settings.json` file as Kubernetes secrets</span>

A Kubernetes Secret is an object that stores sensitive information, such as passwords, tokens, or keys

- **A. Creating Configuration Files**

  ```bash
  cd app_path/danswer/danswer-app
  mkdir saml_sso_config
  cd saml_sso_config
  touch settings.json
  touch advanced_settings.json
  ```

  - Manually copy the contents of your local `settings.json` & `advanced_settings.json` into the newly created files using `nano`, `vim`, or your preferred text editor






- **B. Creating Kubernetes secrets**

  ```bash
  kubectl create secret -n environment_example generic kubetcl_secret \
      --from-file=./settings.json \
      --from-file=./advanced_settings.json
  ```

  <img src="pics/create_secret.png" alt="Creating Kubernetes Secrets">





- **C. Confirming Secrets**

  <img src="pics/confirm_secret1.png" alt="Confirming Secret 1">
  <img src="pics/confirm_secret2.png" alt="Confirming Secret 2">



- **D. Adding `extraMounts` & `extraVolumes` into `values.yaml` file in Helm Chart**

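  ```yaml
  extraMounts:
    apiServer:
    - mountPath: /app_path/danswer/configs/saml_config
      name: sso-files  # volume name: lowercase letters, digits, and "-" only

  extraVolumes:
    apiServer:
    - name: sso-files  # must match the mount name above
      secret:
        secretName: sso-files  # name of the secret created in part 5.B
  ```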

    <span style="color:red"> Note: `secretName` must match the name used when creating the secret; replace the `kubetcl_secret` placeholder in the `kubectl create secret` command from part 5.B with that name. Kubernetes rejects underscores and uppercase letters in secret and volume names, so use hyphens in real names</span>
  - <span style="color:red"> `name` in `extraMounts` & `extraVolumes` is an internal reference that connects the volume mount to the volume definition in the pod specification. The two values must match each other but are not required to match `secretName`</span>
 
 
 
- **E. Upgrading Helm Chart**

  ```bash
  cd ~
  cd helm_chart-path/
  git branch
  git pull
  helm upgrade -n environment_example danswer . -f values_file/filename.yaml
  ```

  <img src="pics/upgrading_helm.png" alt="Upgrading Helm Chart">





- **F. Extra Layer of Verification**
  - **i. Verifying Secrets**

    ```bash
    kubectl describe pod -n environment_example pod_name
    ```

    - Displays detailed information about the specified pod in the environment_example namespace to verify that the secret has been correctly mounted and configured

    <img src="pics/extra_veri.png" alt="Extra Verification">



  - **ii. Verifying Files**

    ```bash
    kubectl exec -n environment_example api-server-deployment-pod-example -i -t -- bash
    cd /extraMounts/mountPath
    ls
    ```

    <img src="pics/extra_veri2_mountPath.png" alt="Verifying Files in Mount Path">
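
Beyond checking that the files exist, the secret's contents can be compared with the local copies. The following is a sketch under assumptions, not the team's procedure: the function name `secret_matches_file` is hypothetical, and it assumes the secret was created with `--from-file` so that each key is named after its file. It uses `kubectl get secret -o go-template` to read one base64-encoded key.

```bash
# Hypothetical helper (not part of the repo).
# Usage: secret_matches_file <namespace> <secret-name> <local-file>
# Returns 0 if the secret key named after the file decodes to the same bytes.
secret_matches_file() {
  local ns="$1" secret="$2" file="$3" key
  key="$(basename "$file")"
  # Secret values are stored base64-encoded under .data.<key>
  if kubectl get secret "$secret" -n "$ns" \
       -o "go-template={{index .data \"$key\"}}" \
     | base64 --decode | cmp -s - "$file"; then
    echo "$key: secret matches local file"
  else
    echo "$key: secret differs from local file"
    return 1
  fi
}
```

Usage with the placeholders from part 5.B: `secret_matches_file environment_example kubetcl_secret ./settings.json`.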



# <span style="color:blue">6) Troubleshooting</span>

The main file to check when troubleshooting is `saml.py`, located in `backend/ee/danswer/server/`

- **A. Email Attribute Related Issues**
  - **i. Switching between `user_email` Definitions**
    - If switching to the `user_email = "@ucsd.edu"` definition resolves the issue, ensure the `RequestedAttribute` in the SP `sp.xml` metadata file is correctly configured for emails

![pic](pics/email.png)



- **B. libxml & xmlsec Dependency Errors**
  - **i.** Use the command `pip install --no-binary lxml lxml` to install `lxml` from source and link it to the same `libxml2` as `xmlsec` to avoid conflicting `libxml2` libraries
  
  
  - **References:**
    - [lxml Launchpad Bug](https://bugs.launchpad.net/lxml/+bug/1960668)
    - [lxml FAQ](https://lxml.de/FAQ.html#my-application-crashes)
    - [Python3 SAML Toolkit Readme](https://github.com/SAML-Toolkits/python3-saml?tab=readme-ov-file#note)


  - **ii.** Alternatively, as a <span style="color:red">Temporary Solution</span>, downgrade to `lxml==4.9.4` if you are using the following dependency versions and encountering `lxml` and `xmlsec` mismatch errors:
    ```text
    xmlsec 1.3.13
    lxml 5.1.0
    Python 3.11.7
    ```
    

# <span style="color:blue">7) Securely Storing Files</span>

- **A. Use a cloud-based password management service to store all `.crt`, `.key`, `sp.xml`, `settings.json`, and `advanced_settings.json` files**

  Example: Using LastPass to store and share sensitive files with the team

    <span style="color:red">Include a secure note detailing support contact information and the expiration dates of the `.crt` & `.key` files</span>
    - <span style="color:red">Treat every other file that embeds the certificate or key as expiring on the same date as the `.crt` & `.key` files</span>
    
![pic](pics/lp.png)

![pic](pics/lp_note.png)
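
To fill in the expiration date for the secure note, the certificate's end date can be read with `openssl`. This is a minimal sketch: the function name `check_cert_expiry` and the 30-day default are assumptions for illustration, while `-enddate` and `-checkend` are standard `openssl x509` options.

```bash
# Hypothetical helper (not part of the repo).
# Usage: check_cert_expiry <cert-file> [days]   (days defaults to 30)
# Prints the notAfter date, then returns 0 if the certificate is still valid
# after <days> days, 1 if it expires sooner, and 2 if it cannot be read.
check_cert_expiry() {
  local cert="$1" days="${2:-30}"
  # Prints a line of the form "notAfter=<date>"
  openssl x509 -in "$cert" -noout -enddate 2>/dev/null || return 2
  # -checkend exits 0 if the certificate will not expire within N seconds
  if openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null; then
    echo "OK: valid for at least $days more days"
  else
    echo "WARNING: expires within $days days"
    return 1
  fi
}
```

Usage: `check_cert_expiry certificate.crt 60`.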