### Explanation of the Predict Function Script (Question 3)

### Process Overview

1. **Sending HTTP Request**:
   - The script defines a `predict` function that sends a POST request to a local server endpoint at `http://127.0.0.1:5000/predict`.
   - It sends the tokenized input as a JSON payload of the form `{"tokens": [...]}`, with one inner list of tokens per sentence.

2. **Debugging Output**:
   - The function prints the HTTP response status code and response text to aid in debugging.
   - This helps verify that the request was processed and what response was returned by the server.

3. **Response Handling**:
   - The function attempts to parse the server's response as JSON.
   - If the response cannot be parsed as JSON, the function returns an error message indicating an invalid JSON response.
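
The fallback in step 3 can be sketched without a running server. The helper name `parse_json_or_error` is hypothetical, and it works on a raw body string with the standard `json` module rather than on a `requests` response object, so treat it as an illustration of the pattern rather than the script's own code.

```python
import json

def parse_json_or_error(text):
    """Return the decoded JSON body, or an error dict if decoding fails."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Same fallback message the predict function returns
        return {"error": "Invalid JSON response"}

# A well-formed body decodes normally
ok = parse_json_or_error('{"prediction": [["B-O"]]}')
# A non-JSON body (for example an HTML error page) yields the error dict
bad = parse_json_or_error("<html>Internal Server Error</html>")

print(ok)
print(bad)
```

Returning a dict in both cases lets the caller test for an `"error"` key instead of wrapping every call in its own `try` block.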

### Findings

- **Status Code**:
  - The server responded with a status code of 200, indicating that the request was successful.
  
- **Response Content**:
  - The response includes prediction results and response times.
  - For instance, the response reported both `max_response_time` and `min_response_time` as 2.04 seconds.
  - The predictions were returned for each tokenized input, formatted as expected.
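
One way to check that the predictions are "formatted as expected" is to confirm that the server returned one label per token. The sketch below assumes the response shape seen in the output (a `prediction` key holding one list of labels per sentence); the function name `labels_align` is hypothetical.

```python
def labels_align(token_batches, response):
    """True if the response holds one label list per sentence and one label per token."""
    predictions = response.get("prediction", [])
    if len(predictions) != len(token_batches):
        return False
    return all(len(tokens) == len(labels)
               for tokens, labels in zip(token_batches, predictions))

# Small illustrative inputs, shaped like the request and response in this notebook
token_batches = [["n", "=", "3", "."], ["KO", ",", "knockout"]]
good_response = {"prediction": [["B-O"] * 4, ["B-O"] * 3]}
short_response = {"prediction": [["B-O"] * 4, ["B-O"] * 2]}

print(labels_align(token_batches, good_response))
print(labels_align(token_batches, short_response))
```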

### Conclusion

The script sends tokenized data to a local prediction server, captures the response, and prints the status code and body for debugging. In the example run the server returned status 200 and one label per input token; every token in both sentences was labeled `B-O`.



In [2]:
import requests

def predict(tokens):
    url = "http://127.0.0.1:5000/predict"
    data = {"tokens": tokens}
    response = requests.post(url, json=data)
    
    # Print the response status code and text for debugging
    print("Status Code:", response.status_code)
    print("Response Text:", response.text)
    
    try:
        return response.json()
    except requests.exceptions.JSONDecodeError:
        return {"error": "Invalid JSON response"}

if __name__ == "__main__":
    tokens = [
        ["Abbreviations", ":", "GEMS", ",", "Global", "Enteric", "Multicenter", "Study", ";", "VIP", 
         ",", "ventilated", "improved", "pit", "."],
        ["We", "developed", "a", "variant", "of", "gene", "set", "enrichment", "analysis", "(", "GSEA", 
         ")", "to", "determine", "whether", "a", "genetic", "pathway", "shows", "evidence", "for", "age", 
         "regulation", "[", "23", "]", "."]
    ]
    response = predict(tokens)


Status Code: 200
Response Text: {
  "max_response_time": "2.04 seconds",
  "min_response_time": "2.04 seconds",
  "prediction": [
    [
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O"
    ],
    [
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O",
      "B-O"
    ]
  ]
}



#### Explanation of the Predict Function Script with Logging and Concurrency (Questions 4 and 5)

This script defines a `predict` function to interact with a local server for predicting results based on tokenized input data. It also includes mechanisms for logging, error handling, and simulating concurrent requests.

The script starts by importing necessary libraries: `requests` for making HTTP requests, `time` and `random` for measuring response times and introducing delays, and `ThreadPoolExecutor` from the `concurrent.futures` module for managing concurrent execution of the `predict` function.

#### Predict Function
The `predict` function sends a POST request to `http://127.0.0.1:5000/predict` with the tokenized data. It measures the response time for the request and handles any request exceptions, returning an error message if needed. The function returns both the server's response and the time taken to get the response.
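
The timing-plus-error pattern described above can be sketched without a server. Two things here are assumptions and not the script's code: `fake_post` is a hypothetical stand-in for the HTTP call, and `time.perf_counter()` is swapped in for `time.time()` because it is a monotonic clock intended for measuring intervals.

```python
import time

def timed_call(func, *args):
    """Run func(*args) and return (result_or_error_dict, elapsed_seconds)."""
    start = time.perf_counter()
    try:
        result = func(*args)
    except Exception as exc:
        # Mirror predict(): report the failure as a dict instead of raising
        result = {"error": str(exc)}
    elapsed = time.perf_counter() - start
    return result, elapsed

def fake_post(tokens):
    """Hypothetical stand-in for the HTTP request; labels every token B-O."""
    if not tokens:
        raise ValueError("empty token list")
    return {"prediction": [["B-O"] * len(tokens)]}

result, elapsed = timed_call(fake_post, ["n", "=", "3", "."])
failed, failed_elapsed = timed_call(fake_post, [])

print(result, f"{elapsed:.4f} seconds")
print(failed, f"{failed_elapsed:.4f} seconds")
```

Measuring the elapsed time after the `try` block means the failure path is timed as well, which matches how `predict` reports a response time even when the request fails.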

**Prediction Process**:
- The function packages the tokenized data into a JSON payload.
- It sends this payload to the specified server endpoint.
- The server processes the data, typically by feeding it into a machine learning model to generate predictions.
- The server then returns a JSON response containing the predictions and possibly other metadata such as response times.

#### Logging Function
The `log_interaction` function logs the details of each interaction to a file named `service_log.txt`. It records the start and end times, the tokens sent, the predictions received, and the response times. This function helps in keeping a detailed log of all interactions for debugging and analysis purposes.
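
To make the log layout concrete, the sketch below builds the text of one entry as a string instead of appending it to `service_log.txt`. The helper name `format_log_entry` and the timestamps are illustrative assumptions; the field labels follow the ones `log_interaction` writes.

```python
def format_log_entry(tokens, predictions, start_time, end_time, response_time):
    """Build the text block that one logged interaction occupies."""
    return (
        f"Start Time: {start_time}\n"
        f"End Time: {end_time}\n"
        f"Tokens: {tokens}\n"
        f"Predictions: {predictions}\n"
        f"Response Time: {response_time:.2f} seconds\n"
        "\n"  # blank line separating consecutive entries
    )

entry = format_log_entry(
    ["n", "=", "3", "."],
    {"prediction": [["B-O", "B-O", "B-O", "B-O"]]},
    "2024-01-01 12:00:00",  # illustrative timestamps, not taken from a real run
    "2024-01-01 12:00:01",
    0.0123,
)
print(entry, end="")
```

Because each entry is plain text with a trailing blank line, the file can be split on empty lines when it is analyzed later.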

#### Main Execution Block
The main execution block initializes the error and request counters, sets the number of iterations (500) and the initial number of worker threads (1), and defines the 17 token lists used as input. A `ThreadPoolExecutor` runs the requests concurrently. Every 50 iterations the script adds 5 to `max_workers`, until the value reaches 50 or more, and writes it to the executor's private `_max_workers` attribute. This is not a documented way to resize a pool, so the resulting thread count depends on the interpreter's implementation.

#### Execution Flow
1. **Setup**: The script starts by importing necessary libraries and defining the `predict` and `log_interaction` functions.
2. **Initialization**: Initializes counters and configures the concurrency settings.
3. **Token Batches**: Defines the token batches for predictions.
4. **Concurrency Handling**: Uses `ThreadPoolExecutor` to handle multiple prediction requests concurrently. Submits tasks to the executor, waits for completion, and processes the responses.
5. **Logging**: Logs each interaction's details, including the tokens sent, the predictions received, and the response times.
6. **Load Simulation**: Introduces a random delay between iterations to simulate varying load conditions and increases the number of concurrent workers periodically to simulate increasing load.
7. **Summary Output**: After completing all iterations, prints the total number of requests made and the number of errors encountered.
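
Steps 4 and 7 can be sketched with a stub in place of the network call. `fake_predict` is hypothetical and returns the same `(response, response_time)` pair that `predict` returns; the submit, collect, and count logic follows the script.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_predict(tokens):
    """Hypothetical stand-in for predict(): returns (response, response_time)."""
    return {"prediction": [["B-O"] * len(tokens)]}, 0.0

batches = [["n", "=", "3", "."], ["KO", ",", "knockout"]]

with ThreadPoolExecutor(max_workers=2) as executor:
    # Submit one task per token list, then wait for each result in submission order
    futures = [executor.submit(fake_predict, tokens) for tokens in batches]
    responses = [future.result() for future in futures]

total_requests = 0
error_count = 0
for tokens, (response, response_time) in zip(batches, responses):
    total_requests += 1
    if "error" in response:
        error_count += 1

print(f"Total requests: {total_requests}")
print(f"Total errors: {error_count}")
```

Collecting results from the `futures` list in order keeps each response paired with the token list that produced it, whichever request finishes first.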

#### Findings

- **Concurrency Handling**: The script submitted each iteration's 17 requests to the thread pool and waited for all of them before starting the next iteration, so at most 17 requests can be in flight at once, however far `max_workers` is raised. The pool starts with a single worker thread, which makes the first iteration effectively sequential.
- **Logging**: Each interaction was appended to `service_log.txt` with the tokens sent, the predictions received, and the per-request response time. The start and end times are recorded once per iteration, so all 17 entries from the same iteration share the same timestamps.
- **Error Handling**: The script handled errors gracefully, returning error messages when request exceptions occurred. The number of errors encountered during the execution was tracked and reported.
- **Performance**: The script sleeps for a random 0.5 to 2.0 seconds between iterations to vary the request rate. In the portion of the output shown, individual requests completed in 0.00 to 0.01 seconds, while the server itself reported a maximum response time of 2.04 seconds and a minimum of 0.00 seconds.
- **Summary Output**: At the end of the execution, the script printed the total number of requests made and the number of errors encountered, providing a clear summary of the performance testing.
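
The worker ramp can be traced without running any requests. The sketch below replays only the script's counter logic (start at 1, add 5 whenever `iteration % 50 == 0` and the current value is below 50); the function name `worker_schedule` is hypothetical.

```python
def worker_schedule(max_iterations, start=1, step=5, cap=50, every=50):
    """Replay the script's ramp rule and return (iteration, new max_workers) pairs."""
    max_workers = start
    changes = []
    for iteration in range(max_iterations):
        if iteration % every == 0 and max_workers < cap:
            max_workers += step
            changes.append((iteration, max_workers))
    return changes

schedule = worker_schedule(500)
for iteration, workers in schedule:
    print(f"after iteration {iteration}: max_workers = {workers}")
```

Two details follow from the rule as written: the first increase happens at iteration 0, and the last one takes the value from 46 to 51, slightly past the nominal limit of 50.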

#### Prediction Process
The prediction process involves sending tokenized text data to a local server, which uses a machine learning model to generate predictions. The steps are as follows:
1. **Data Preparation**: Tokenized input data is prepared in a JSON format.
2. **Sending Request**: The data is sent to the server via a POST request.
3. **Server Processing**: The server processes the data using a machine learning model to produce predictions.
4. **Receiving Response**: The server sends back a JSON response containing the predictions.
5. **Error Handling**: If the server returns an error or the response is not valid JSON, an error message is generated.

This script is useful for testing the performance and reliability of a prediction server under varying loads. It logs detailed information for each request and handles errors gracefully, making it a robust tool for performance testing and debugging.


In [3]:
import requests
import time
import random
from concurrent.futures import ThreadPoolExecutor

def predict(tokens):
    url = "http://127.0.0.1:5000/predict"
    data = {"tokens": tokens}
    try:
        start_time = time.time()
        response = requests.post(url, json=data)
        response.raise_for_status()  # Check if the request was successful
        end_time = time.time()
        response_time = end_time - start_time
        return response.json(), response_time
    except requests.RequestException as e:
        end_time = time.time()
        response_time = end_time - start_time
        return {"error": str(e)}, response_time

# Function to log interactions
def log_interaction(tokens, predictions, start_time, end_time, response_time):
    with open("service_log.txt", "a") as f:
        f.write(f"Start Time: {start_time}\n")
        f.write(f"End Time: {end_time}\n")
        f.write(f"Tokens: {tokens}\n")
        f.write(f"Predictions: {predictions}\n")
        f.write(f"Response Time: {response_time:.2f} seconds\n")
        f.write("\n")

if __name__ == "__main__":
    error_count = 0
    total_requests = 0
    max_iterations = 500
    max_workers = 1

    tokens_batch = [
        [
            "Abbreviations", ":", "GEMS", ",", "Global", "Enteric", "Multicenter", "Study", ";", "VIP",
            ",", "ventilated", "improved", "pit", "."
        ],
        [
            "We", "developed", "a", "variant", "of", "gene", "set", "enrichment", "analysis", "(", "GSEA",
            ")", "to", "determine", "whether", "a", "genetic", "pathway", "shows", "evidence", "for", "age",
            "regulation", "[", "23", "]", "."
        ],
        [
            "(", "Figure", "S10B", ")", ".",
            "(", "Figures", "7A", ",", "B", "and", "S10A", ")", "."
        ],
        [
            "Source", "data", "are", "available",
            "from", "https://figshare.com/s/b14c8e6cb1fc5135dd87", "."
        ],
        [
            "KO", ",", "knockout", ";", "PSD", ",", "postsynaptic", "density", "."
        ],
        [
            "FISH", ",", "fluorescence",
            "in", "situ", "hybridization", "."
        ],
        [
            "PD", ";", "the", "ERK", "inhibitor", "PD98059", "."
        ],
        [
            "HPD", ",", "human", "population", "density", "."
        ],
        [
            "n", "=", "3", "."
        ],
        [
            "*", "p", "≤", "0.05", ",", "*", "*", "*", "p", "≤", "0.001", "."
        ],
        [
            "(", "B", ")", "Trimethoprim", "+", "AZT", "."
        ],
        [
            "(", "A", ")", "Trimethoprim", "+", "sulfamethizole", ".", ")"
        ],
        [
            "SDQ", ",", "Strengths", "and", "Difficulties", "Questionnaire", "."
        ],
        [
            "DIC", ",", "differential", "interference", "contrast", ";", "SM", ",", "Sperm", "Medium", "."
        ],
        [
            "(", "A", ")", "Two", "-", "dimensional", "representation", "of", "the", "mTufA", "5′", "UTR", "."
        ],
        [
            "GLA", ",", "grape", "-", "like", "aggregation", "."
        ],
        [
            "These", "RARs", "are", "structurally-", "and", "functionally", "-", "conserved", "nuclear",
            "retinoid", "receptors", "[", "36–39", "]", "."
        ]
    ]

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        for iteration in range(max_iterations):
            start_time = time.strftime("%Y-%m-%d %H:%M:%S")

            futures = [executor.submit(predict, tokens) for tokens in tokens_batch]
            responses = [future.result() for future in futures]

            end_time = time.strftime("%Y-%m-%d %H:%M:%S")

            for tokens, (response, response_time) in zip(tokens_batch, responses):
                log_interaction(tokens, response, start_time, end_time, response_time)
                total_requests += 1
                if "error" in response:
                    error_count += 1
                print(f"Tokens: {tokens}\nResponse: {response}\nResponse Time: {response_time:.2f} seconds")

            # Introduce a random delay to simulate realistic load
            time.sleep(random.uniform(0.5, 2.0))

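            # Gradually increase load. `iteration % 50 == 0` is also true at
            # iteration 0, so the first increase happens after the first pass.
            if iteration % 50 == 0 and max_workers < 50:
                max_workers += 5
                # `_max_workers` is a private attribute; resizing a live pool
                # this way is not a documented ThreadPoolExecutor feature.
                executor._max_workers = max_workers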

    print(f"Total requests: {total_requests}")
    print(f"Total errors: {error_count}")


Tokens: ['Abbreviations', ':', 'GEMS', ',', 'Global', 'Enteric', 'Multicenter', 'Study', ';', 'VIP', ',', 'ventilated', 'improved', 'pit', '.']
Response: {'max_response_time': '2.04 seconds', 'min_response_time': '0.00 seconds', 'prediction': [['B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O']]}
Response Time: 0.01 seconds
Tokens: ['We', 'developed', 'a', 'variant', 'of', 'gene', 'set', 'enrichment', 'analysis', '(', 'GSEA', ')', 'to', 'determine', 'whether', 'a', 'genetic', 'pathway', 'shows', 'evidence', 'for', 'age', 'regulation', '[', '23', ']', '.']
Response: {'max_response_time': '2.04 seconds', 'min_response_time': '0.00 seconds', 'prediction': [['B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O']]}
Response Time: 0.00 seconds
Tokens: ['(', 'Figure', 'S10B', ')', '.', '(', 'Figures', '7A', 

#### Explanation of the Predict Function Script with Logging and Concurrency

This script defines a `predict` function to interact with a local server for predicting results based on tokenized input data. It includes mechanisms for logging, error handling, and simulating concurrent requests, and is designed to run for 3000 iterations.

The script starts by importing necessary libraries: `requests` for making HTTP requests, `time` and `random` for measuring response times and introducing delays, and `ThreadPoolExecutor` from the `concurrent.futures` module for managing concurrent execution of the `predict` function.

#### Predict Function
The `predict` function sends a POST request to `http://127.0.0.1:5000/predict` with the tokenized data. It measures the response time for the request and handles any request exceptions, returning an error message if needed. The function returns both the server's response and the time taken to get the response.

#### Logging Function
The `log_interaction` function logs the details of each interaction to a file named `service_log.txt`. It records the start and end times, the tokens sent, the predictions received, and the response times. This function helps in keeping a detailed log of all interactions for debugging and analysis purposes.

#### Main Execution Block
The main execution block initializes error and request counters and sets the number of iterations to 3000 and the initial number of worker threads to 1. It defines a list of token batches to be used as input for predictions. Using `ThreadPoolExecutor`, it manages concurrent requests, starting with a single worker thread and gradually increasing the number of threads to simulate increasing load.

#### Execution Flow
1. **Setup**: The script starts by importing necessary libraries and defining the `predict` and `log_interaction` functions.
2. **Initialization**: Initializes counters and configures the concurrency settings.
3. **Token Batches**: Defines the token batches for predictions.
4. **Concurrency Handling**: Uses `ThreadPoolExecutor` to handle multiple prediction requests concurrently. Submits tasks to the executor, waits for completion, and processes the responses.
5. **Logging**: Logs each interaction's details, including the tokens sent, the predictions received, and the response times.
6. **Load Simulation**: Introduces a random delay between iterations to simulate varying load conditions and increases the number of concurrent workers periodically to simulate increasing load.
7. **Iteration Loop**: For each iteration, the script logs the start time, submits prediction tasks, collects and logs the responses, and records the end time. It increases the number of worker threads every 50 iterations and prints the iteration number.
8. **Summary Output**: After completing all 3000 iterations, the script prints the total number of requests made and the number of errors encountered.

#### Findings

- **Concurrency Handling**: The script successfully managed concurrent requests, starting with a single worker thread and gradually increasing the number to simulate increasing load. This approach ensures the server's performance is tested under various load conditions.
- **Logging**: Each interaction's details were logged comprehensively, including start and end times, tokens sent, predictions received, and response times. This detailed logging is useful for debugging and performance analysis.
- **Error Handling**: The script handled errors gracefully, returning error messages when request exceptions occurred. The number of errors encountered during the execution was tracked and reported.
- **Performance**: The script introduced random delays to simulate realistic load conditions, which helps in assessing the server's performance under fluctuating loads. The gradual increase in the number of worker threads also provided insights into how the server handles increasing load over time.
- **Summary Output**: At the end of the execution, the script printed the total number of requests made and the number of errors encountered, providing a clear summary of the performance testing.
- **Scalability Limitations**: The script runs 3000 sequential iterations of 17 requests each (51,000 requests in total), not 3000 parallel runs, and concurrency within an iteration is bounded by the 17 requests submitted. The server may degrade or fail under sustained load, but the output shown here is truncated and does not include the final request and error totals, so it does not by itself demonstrate such a failure.

This script is useful for testing the performance and reliability of a prediction server under varying loads. It logs detailed information for each request and handles errors gracefully, making it a robust tool for performance testing and debugging.
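
The scale of a full run follows from the script's constants: 17 token lists per iteration and a sleep of 0.5 to 2.0 seconds after each iteration. The sketch below works through that arithmetic; the function name `workload` is hypothetical, and the sleep totals ignore the time spent on the requests themselves.

```python
BATCH_SIZE = 17  # number of token lists in tokens_batch

def workload(max_iterations, batch_size=BATCH_SIZE, min_delay=0.5, max_delay=2.0):
    """Return (total requests, minimum sleep seconds, expected sleep seconds)."""
    total_requests = max_iterations * batch_size
    min_sleep = max_iterations * min_delay
    # random.uniform(a, b) has mean (a + b) / 2
    mean_sleep = max_iterations * (min_delay + max_delay) / 2
    return total_requests, min_sleep, mean_sleep

for iterations in (500, 3000):
    total, min_sleep, mean_sleep = workload(iterations)
    print(f"{iterations} iterations: {total} requests, "
          f"at least {min_sleep:.0f} s asleep, about {mean_sleep:.0f} s on average")
```

For 3000 iterations the sleeps alone add up to roughly an hour on average, which is worth knowing before starting the cell.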


In [None]:
import requests
import time
import random
from concurrent.futures import ThreadPoolExecutor

def predict(tokens):
    url = "http://127.0.0.1:5000/predict"
    data = {"tokens": tokens}
    try:
        start_time = time.time()
        response = requests.post(url, json=data)
        response.raise_for_status()  # Check if the request was successful
        end_time = time.time()
        response_time = end_time - start_time
        return response.json(), response_time
    except requests.RequestException as e:
        end_time = time.time()
        response_time = end_time - start_time
        return {"error": str(e)}, response_time

# Function to log interactions
def log_interaction(tokens, predictions, start_time, end_time, response_time):
    with open("service_log.txt", "a") as f:
        f.write(f"Start Time: {start_time}\n")
        f.write(f"End Time: {end_time}\n")
        f.write(f"Tokens: {tokens}\n")
        f.write(f"Predictions: {predictions}\n")
        f.write(f"Response Time: {response_time:.2f} seconds\n")
        f.write("\n")

if __name__ == "__main__":
    error_count = 0
    total_requests = 0
    max_iterations = 3000
    max_workers = 1

    tokens_batch = [
        ["Abbreviations", ":", "GEMS", ",", "Global", "Enteric", "Multicenter", "Study", ";", "VIP", ",", "ventilated", "improved", "pit", "."],
        ["We", "developed", "a", "variant", "of", "gene", "set", "enrichment", "analysis", "(", "GSEA", ")", "to", "determine", "whether", "a", "genetic", "pathway", "shows", "evidence", "for", "age", "regulation", "[", "23", "]", "."],
        ["(", "Figure", "S10B", ")", ".", "(", "Figures", "7A", ",", "B", "and", "S10A", ")", "."],
        ["Source", "data", "are", "available", "from", "https://figshare.com/s/b14c8e6cb1fc5135dd87", "."],
        ["KO", ",", "knockout", ";", "PSD", ",", "postsynaptic", "density", "."],
        ["FISH", ",", "fluorescence", "in", "situ", "hybridization", "."],
        ["PD", ";", "the", "ERK", "inhibitor", "PD98059", "."],
        ["HPD", ",", "human", "population", "density", "."],
        ["n", "=", "3", "."],
        ["*", "p", "≤", "0.05", ",", "*", "*", "*", "p", "≤", "0.001", "."],
        ["(", "B", ")", "Trimethoprim", "+", "AZT", "."],
        ["(", "A", ")", "Trimethoprim", "+", "sulfamethizole", ".", ")"],
        ["SDQ", ",", "Strengths", "and", "Difficulties", "Questionnaire", "."],
        ["DIC", ",", "differential", "interference", "contrast", ";", "SM", ",", "Sperm", "Medium", "."],
        ["(", "A", ")", "Two", "-", "dimensional", "representation", "of", "the", "mTufA", "5′", "UTR", "."],
        ["GLA", ",", "grape", "-", "like", "aggregation", "."],
        ["These", "RARs", "are", "structurally-", "and", "functionally", "-", "conserved", "nuclear", "retinoid", "receptors", "[", "36–39", "]", "."]
    ]

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        for iteration in range(max_iterations):
            start_time = time.strftime("%Y-%m-%d %H:%M:%S")

            futures = [executor.submit(predict, tokens) for tokens in tokens_batch]
            responses = [future.result() for future in futures]

            end_time = time.strftime("%Y-%m-%d %H:%M:%S")

            for tokens, (response, response_time) in zip(tokens_batch, responses):
                log_interaction(tokens, response, start_time, end_time, response_time)
                total_requests += 1
                if "error" in response:
                    error_count += 1
                print(f"Tokens: {tokens}\nResponse: {response}\nResponse Time: {response_time:.2f} seconds")

            # Introduce a random delay to simulate realistic load
            time.sleep(random.uniform(0.5, 2.0))

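            # Gradually increase load. `iteration % 50 == 0` is also true at
            # iteration 0, so the first increase happens after the first pass.
            if iteration % 50 == 0 and max_workers < 50:
                max_workers += 5
                # `_max_workers` is a private attribute; resizing a live pool
                # this way is not a documented ThreadPoolExecutor feature.
                executor._max_workers = max_workers

            # Print progress after every iteration; if the run is interrupted,
            # the last number printed shows where it stopped.
            print(f"Iteration: {iteration + 1}")  # Add 1 because iteration starts from 0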

    print(f"Total requests: {total_requests}")
    print(f"Total errors: {error_count}")


Tokens: ['Abbreviations', ':', 'GEMS', ',', 'Global', 'Enteric', 'Multicenter', 'Study', ';', 'VIP', ',', 'ventilated', 'improved', 'pit', '.']
Response: {'max_response_time': '2.04 seconds', 'min_response_time': '0.00 seconds', 'prediction': [['B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O']]}
Response Time: 0.01 seconds
Tokens: ['We', 'developed', 'a', 'variant', 'of', 'gene', 'set', 'enrichment', 'analysis', '(', 'GSEA', ')', 'to', 'determine', 'whether', 'a', 'genetic', 'pathway', 'shows', 'evidence', 'for', 'age', 'regulation', '[', '23', ']', '.']
Response: {'max_response_time': '2.04 seconds', 'min_response_time': '0.00 seconds', 'prediction': [['B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O', 'B-O']]}
Response Time: 0.00 seconds
Tokens: ['(', 'Figure', 'S10B', ')', '.', '(', 'Figures', '7A', 