# EvoDiff deployment on Azure AI Foundry
This notebook demonstrates how to invoke the EvoDiff endpoint using Python on Azure AI Foundry.

## Prerequisites

Before you can successfully invoke the endpoint, you need two key pieces of information that you must define in the code cells below:

1.  **Endpoint URL**: This is the specific URL for your deployed EvoDiff model. You will need to replace the placeholder value in the `ENDPOINT_URL` variable in the code cell under 'Endpoint URL and API Key'.
2.  **API Key**: This is the secret key required to authenticate with your endpoint. You will need to replace the placeholder value in the `API_KEY` variable in the code cell under 'Endpoint URL and API Key'.

Once you have defined your endpoint URL and API key, the notebook walks through calling the `generate` function with an input payload to get predictions from the model.

In [None]:
import json
import time
import urllib.error  # imported explicitly because generate() catches urllib.error.HTTPError
import urllib.request
from dataclasses import dataclass, asdict

## Endpoint URL and API Key

**Important:** Replace the placeholder values below with your actual Endpoint URL and API Key.

In [None]:
# Replace with your actual endpoint URL and API key
ENDPOINT_URL = "YOUR_ENDPOINT_URL_HERE"  # e.g., "https://my-model.ukeast.inference.ml.azure.com/score"
API_KEY = "YOUR_API_KEY_HERE"
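
Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. As one possible alternative, the sketch below reads a setting from an environment variable and rejects the placeholder strings. The variable names `EVODIFF_ENDPOINT_URL` and `EVODIFF_API_KEY` and the helper `load_setting` are assumptions made for this example; the endpoint does not require them.

```python
import os

# The placeholder strings used in the cell above
PLACEHOLDERS = {"YOUR_ENDPOINT_URL_HERE", "YOUR_API_KEY_HERE"}


def load_setting(env_name: str, fallback: str) -> str:
    """Return the environment variable if it is set, otherwise the fallback.

    Raises ValueError when the resulting value is empty or still a placeholder.
    """
    value = os.environ.get(env_name, fallback).strip()
    if not value or value in PLACEHOLDERS:
        raise ValueError(f"{env_name} is not set and the fallback is still a placeholder")
    return value


# Possible usage in the notebook (hypothetical variable names):
# ENDPOINT_URL = load_setting("EVODIFF_ENDPOINT_URL", ENDPOINT_URL)
# API_KEY = load_setting("EVODIFF_API_KEY", API_KEY)
```

Failing early with a clear message is usually easier to debug than an authentication error returned by the endpoint.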

## Generate Function
The call to the endpoint is wrapped in a function, `generate`.

In [None]:
def generate(payload: dict[str, object]) -> dict[str, object]:
    """POST the JSON payload to the endpoint and return the parsed JSON response."""
    body = json.dumps(payload).encode("utf-8")

    headers = {"Content-Type": "application/json", "Authorization": "Bearer " + API_KEY}

    req = urllib.request.Request(ENDPOINT_URL, body, headers)

    try:
        # The context manager closes the connection once the body has been read
        with urllib.request.urlopen(req) as response:
            return json.loads(response.read())
    except urllib.error.HTTPError as error:
        # Print the status, headers, and body to help diagnose the failure
        print("The request failed with status code: " + str(error.code))
        print(error.info())
        print(error.read().decode("utf8", "ignore"))
        raise  # re-raise with the original traceback
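
Hosted endpoints can return transient errors while they are busy or scaling. The sketch below shows one way to retry a call with exponential backoff. The helper `call_with_retry`, the set of retryable status codes, and the delay values are assumptions for illustration; they are not documented behavior of the EvoDiff endpoint.

```python
import time
import urllib.error
from typing import Callable

# Assumption: these status codes are treated as transient for this sketch
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retry(
    call: Callable[[dict], dict],
    payload: dict,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `call(payload)`, retrying retryable HTTP errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call(payload)
        except urllib.error.HTTPError as error:
            # Give up on non-retryable errors or once the attempts are used up
            if error.code not in RETRYABLE_STATUS or attempt == max_attempts:
                raise
            # Wait base_delay, then 2 * base_delay, then 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


# Possible usage with the function defined above:
# result = call_with_retry(generate, payload)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.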

## Simplified Input Dataclass
The parameters of the endpoint are wrapped in a dataclass to simplify building the request payload.

In [None]:
@dataclass
class EvoDiffInput:
    sequence: str
    count: int = 1
    nonstandard_aas: bool = False
    sampling_t: float = 1.0
    repeat_penalty: float | None = None

    def to_payload(self) -> dict[str, object]:
        data = asdict(self)
        # Drop fields set to None: the endpoint might not accept them, or might apply its default when a field is absent
        filtered_data = {k: v for k, v in data.items() if v is not None}
        return {
            "input_data": {
                "columns": list(filtered_data.keys()),
                "index": [0],
                "data": [list(filtered_data.values())],
            }
        }
    

def run_evodiff(
    sequence: str,
    count: int = 1,
    nonstandard_aas: bool = False,
    sampling_t: float = 1.0,
    repeat_penalty: float | None = None,
) -> dict[str, object]:
    
    # Create input object and convert to payload
    input_obj = EvoDiffInput(
        sequence=sequence, 
        count=count,
        nonstandard_aas=nonstandard_aas,
        sampling_t=sampling_t,
        repeat_penalty=repeat_penalty
    )
    input_payload = input_obj.to_payload()
    
    # Print input payload for reference
    print(f"Input Payload: {json.dumps(input_payload, indent=2)}")
    
    # Call the model and measure time
    start_ts = time.time()
    result = generate(input_payload)
    time_taken = time.time() - start_ts
    
    # Print results
    print(f"Result: {json.dumps(result, indent=2)}")
    print(f"Time taken: {time_taken:.2f} seconds")
    
    return result
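
To make the request format concrete, the standalone sketch below rebuilds the same `columns`/`index`/`data` layout that `to_payload` produces, without calling the endpoint. The helper name `build_payload` and the short sequence are introduced here for illustration only.

```python
import json


def build_payload(**fields: object) -> dict[str, object]:
    """Build a single-row payload, omitting any field whose value is None."""
    present = {name: value for name, value in fields.items() if value is not None}
    return {
        "input_data": {
            "columns": list(present.keys()),
            "index": [0],
            "data": [list(present.values())],
        }
    }


payload = build_payload(
    sequence="##AC##",      # toy sequence, far shorter than a real input
    count=2,
    nonstandard_aas=False,
    sampling_t=1.0,
    repeat_penalty=None,    # None, so it is left out of the payload
)
print(json.dumps(payload, indent=2))
```

Because `repeat_penalty` is `None`, it appears in neither `columns` nor `data`, and the two lists stay aligned position by position.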

## Example Usage

### Unconditional generation
Use EvoDiff to unconditionally generate 5 sequences of length 100.

In [None]:
sequence = "#" * 100  # Input sequence of 100 '#' mask tokens to be filled.

result = run_evodiff(sequence, count=5)
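
Long runs of `#` are hard to check by eye, so it can help to validate a template before sending it. The sketch below counts masked and fixed positions and rejects unexpected characters. The helper `check_template` is hypothetical, and it assumes that only `#` and the 20 standard amino acid letters are valid; relax that check if you rely on `nonstandard_aas`.

```python
STANDARD_AAS = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids
MASK = "#"


def check_template(sequence: str) -> dict[str, int]:
    """Return length, masked, and fixed position counts; raise on unexpected characters."""
    invalid = sorted({ch for ch in sequence if ch != MASK and ch not in STANDARD_AAS})
    if invalid:
        raise ValueError(f"Unexpected characters in sequence: {invalid}")
    masked = sequence.count(MASK)
    return {"length": len(sequence), "masked": masked, "fixed": len(sequence) - masked}


print(check_template("#" * 100))
print(check_template("##AC#"))
```

Comparing the reported `length` with the length you intended catches miscounted mask tokens before any endpoint time is spent.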

### Conditional generation: scaffolding the calcium binding site of calmodulin 
Use EvoDiff to generate 10 scaffolds, 100 residues in length, for the calcium-binding motifs in 1PRW.

Binding motifs: 

**Residues** 16-35 (FSLFDKDGDGTITTKELGTV)

**Residues** 52-71 (INEVDADGNGTIDFPEFLTM)

In [None]:
sequence = "########################FSLFDKDGDGTITTKELGTV###############################INEVDADGNGTIDFPEFLTM#############################################" # Input sequence with '#' as mask tokens to be filled. Use standard amino acid letters for fixed positions.

result = run_evodiff(sequence, count=10)
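
Writing a scaffold template by hand makes it easy to misplace a motif or miscount the masks. As a sketch, the hypothetical helper `build_template` below places motifs at chosen 0-based start positions inside a masked sequence of a given length. The placement in the example (motifs at their native 1PRW positions, residues 16-35 and 52-71) is an illustrative choice and is not necessarily the layout used in the cell above.

```python
MASK = "#"


def build_template(total_length: int, motifs: dict[int, str]) -> str:
    """Place each motif at its 0-based start index in a fully masked sequence."""
    template = [MASK] * total_length
    for start, motif in sorted(motifs.items()):
        end = start + len(motif)
        if start < 0 or end > total_length:
            raise ValueError(f"Motif at {start} does not fit in length {total_length}")
        if any(ch != MASK for ch in template[start:end]):
            raise ValueError(f"Motif at {start} overlaps an earlier motif")
        template[start:end] = motif  # assigns one character per position
    return "".join(template)


# Residues 16-35 and 52-71 (1-based) become start indices 15 and 51 (0-based)
template = build_template(
    100,
    {15: "FSLFDKDGDGTITTKELGTV", 51: "INEVDADGNGTIDFPEFLTM"},
)
print(len(template), template)
```

Printing `len(template)` confirms that the total length matches the scaffold length you intended to request.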

### Additional parameters

The endpoint also accepts the following additional parameters:

* `nonstandard_aas`: Set to `True` to include non-standard amino acids in sampling. Enable only for specialized applications. Default: `False` (only the 20 standard amino acids).

* `sampling_t`: Sampling temperature. Values above 1.0 increase diversity; values below 1.0 make sampling more conservative. Default: `1.0`.

* `repeat_penalty`: Penalty that reduces adjacent amino acid repeats. Higher values suppress repeats more aggressively. Recommended values: 1.2-2.0. Default: `None` (no penalty).
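
To build intuition for `sampling_t`, the sketch below applies the conventional temperature-scaled softmax to a few toy logits. This notebook does not describe how the endpoint applies the temperature internally, so treat this as the textbook technique and not as the endpoint's implementation; the function name and the logits are made up for the example.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate residues
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring option, while higher temperatures flatten the distribution, which matches the diversity trade-off described above.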


In [None]:
sequence = "#" * 100  # Input sequence of 100 '#' mask tokens to be filled.
count = 5 # number of sequences to generate

result = run_evodiff(sequence=sequence, count=count, nonstandard_aas=True, sampling_t=0.5, repeat_penalty=1.2)

### Note

-   Replace the placeholder values (endpoint URL and API key) with your own, and adapt the example inputs to your data.
-   Consult the documentation for your specific EvoDiff endpoint to understand the expected request payload format and the structure of the response.
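
When the response structure is not yet known, a small helper that summarizes the shape of a parsed JSON value can be a useful first look. The sketch below is a generic utility. The function name `describe` and the sample dictionary are invented for the example, and the sample is not the real EvoDiff response format.

```python
def describe(value: object, max_depth: int = 3) -> object:
    """Summarize nested JSON data by replacing leaf values with their type names."""
    if max_depth <= 0:
        return type(value).__name__
    if isinstance(value, dict):
        return {key: describe(item, max_depth - 1) for key, item in value.items()}
    if isinstance(value, list):
        if not value:
            return "list (empty)"
        # Describe only the first element and note how many items there are
        return [describe(value[0], max_depth - 1), f"... {len(value)} item(s)"]
    return type(value).__name__


# Made-up data, used only to show what the summary looks like
sample = {"items": [{"id": 1, "text": "abc"}, {"id": 2, "text": "def"}], "elapsed": 0.5}
print(describe(sample))
```

Calling `describe(result)` on the value returned by `run_evodiff` would give a compact overview before you write code that depends on specific keys.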