In [3]:
import os
import requests
import pandas as pd

link = 'https://www.cms.gov/files/document/valid-icd-10-list.xlsx'
# Download the xlsx file; fail early on HTTP errors instead of saving an error page
response = requests.get(link, timeout=60)
response.raise_for_status()

#check if the data folder exists, else create it
def check_and_create_folder(folder_path):
    if not os.path.exists(folder_path):
        os.makedirs(folder_path)
        print(f"Folder '{folder_path}' created.")
    else:
        print(f"Folder '{folder_path}' already exists.")
check_and_create_folder('./data')

# Save the downloaded file
with open('./data/icd10_codes.xlsx', 'wb') as file:
    file.write(response.content)

# Load the data into a DataFrame
icd_codes = pd.read_excel('./data/icd10_codes.xlsx')
print(icd_codes.head())

Folder './data' already exists.
    CODE            SHORT DESCRIPTION (VALID ICD-10 FY2025)  \
0   A000  Cholera due to Vibrio cholerae 01, biovar chol...   
1   A001    Cholera due to Vibrio cholerae 01, biovar eltor   
2   A009                               Cholera, unspecified   
3  A0100                         Typhoid fever, unspecified   
4  A0101                                 Typhoid meningitis   

              LONG DESCRIPTION (VALID ICD-10 FY2025) NF EXCL  
0  Cholera due to Vibrio cholerae 01, biovar chol...     NaN  
1    Cholera due to Vibrio cholerae 01, biovar eltor     NaN  
2                               Cholera, unspecified     NaN  
3                         Typhoid fever, unspecified     NaN  
4                                 Typhoid meningitis     NaN  


Speeding Up Embedding Generation:
Generating embeddings can be slow, especially for large datasets or real-time applications. Depending on the size of the data and the complexity of the model, it can become the bottleneck of a pipeline. Several strategies help speed it up:

1. Batch Processing:
Batch processing groups multiple inputs (texts, images, or data points) together and passes them through the embedding model in a single call. Instead of processing each input sequentially, which can be slow, the model handles many inputs at once.

How it works: Deep learning frameworks such as TensorFlow and PyTorch are optimized for batch operations. When the model receives a batch of inputs (e.g., 32 or 64 sentences), the computation is parallelized across the GPU, which is typically much faster than processing each sentence one by one.

Advantages:

Parallelization: Batch processing lets the model exploit parallel hardware, using the cores of the CPU or GPU efficiently.
Faster Data Loading: Data for several inputs is read and moved to the device together rather than one item at a time, which usually reduces loading overhead.
Lower Per-Input Overhead: With a suitable batch size, the fixed cost of each model call is spread across many inputs. Batching mainly improves throughput; very large batches can increase the latency of an individual input.
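
The batching idea can be sketched without any deep learning framework. The snippet below is a minimal illustration, not this notebook's pipeline: `batched`, `embed_in_batches`, and `toy_embed_batch` are hypothetical names, and `toy_embed_batch` is only a stand-in for a real model call that accepts a list of texts and returns one vector per text.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 32,
) -> List[List[float]]:
    """Call `embed_batch` once per batch and concatenate the results in order."""
    embeddings: List[List[float]] = []
    for batch in batched(texts, batch_size):
        embeddings.extend(embed_batch(batch))
    return embeddings


def toy_embed_batch(batch: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for a model: [character count, space count] per text."""
    return [[float(len(text)), float(text.count(" "))] for text in batch]


if __name__ == "__main__":
    sample_texts = ["Cholera, unspecified", "Typhoid fever, unspecified", "Typhoid meningitis"]
    # With batch_size=2 the stand-in model is called twice instead of three times.
    vectors = embed_in_batches(sample_texts, toy_embed_batch, batch_size=2)
    print(len(vectors), "embeddings generated")
```

With a real model, `embed_batch` would be replaced by the library's own batch call, and the best `batch_size` depends on available memory, so it is worth measuring a few values rather than assuming one.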

In [2]:
import os
import pandas as pd

2. Asynchronous Processing:
Asynchronous processing lets tasks run concurrently instead of waiting for each one to finish before starting the next. This can significantly speed up embedding generation when the work is I/O-bound, for example when calling external APIs or cloud services to generate embeddings. It does not speed up CPU-bound computation by itself.

How it works: Requests to the embedding service are issued as non-blocking calls. While one request is waiting on the network, the program can issue or handle other requests, and each result is processed as soon as its request completes.
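
To make the difference concrete, here is a small sketch that needs no running service. It assumes a hypothetical `fake_embedding_request` coroutine that simulates network latency with `asyncio.sleep`; the function names and the returned one-element vectors are illustrative only.

```python
import asyncio
import time
from typing import List


async def fake_embedding_request(text: str, delay: float = 0.1) -> List[float]:
    """Hypothetical stand-in for a network call: wait, then return a dummy vector."""
    await asyncio.sleep(delay)
    return [float(len(text))]


async def embed_sequentially(texts: List[str], delay: float = 0.1) -> List[List[float]]:
    """Await each request before starting the next one."""
    return [await fake_embedding_request(text, delay) for text in texts]


async def embed_concurrently(texts: List[str], delay: float = 0.1) -> List[List[float]]:
    """Start all requests at once and wait for them together; order is preserved."""
    results = await asyncio.gather(*(fake_embedding_request(text, delay) for text in texts))
    return list(results)


async def main() -> None:
    texts = ["Hello, world!", "This is an example sentence.", "Another sentence."]
    for strategy in (embed_sequentially, embed_concurrently):
        start = time.perf_counter()
        await strategy(texts)
        elapsed = time.perf_counter() - start
        print(f"{strategy.__name__}: {elapsed:.2f}s")


if __name__ == "__main__":
    # In a plain Python script; inside a notebook cell, use `await main()` instead.
    asyncio.run(main())
```

The sequential version waits roughly the sum of the delays, while the concurrent version waits roughly as long as the slowest single request. Real services often enforce rate limits, so the number of simultaneous requests may need to be capped.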

Advantages:

Non-blocking: Asynchronous methods allow for concurrent execution of multiple tasks, improving throughput.
Resource Efficiency: The system does not sit idle while waiting for one task to finish but instead performs other tasks, thus making full use of system resources.
Example using Python’s asyncio:

In [1]:
import asyncio
import aiohttp

# Fetch embedding function
async def fetch_embedding(text):
    url = "http://embedding_service/api"
    async with aiohttp.ClientSession() as session:
        async with session.post(url, json={"text": text}) as response:
            return await response.json()

# Function to generate embeddings
async def generate_embeddings(texts):
    tasks = [fetch_embedding(text) for text in texts]
    embeddings = await asyncio.gather(*tasks)
    return embeddings

# For running in Jupyter or other environments with an already-running event loop
def run_async_code(coro):
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:  # No running event loop
        loop = None

    if loop and loop.is_running():
        # A loop is already running (e.g. in Jupyter). asyncio.ensure_future would only
        # schedule the coroutine and return a pending Task, not the embeddings, so
        # patch the loop with nest_asyncio and block until the result is ready.
        import nest_asyncio  # Allows re-entrant use of the running loop
        nest_asyncio.apply()
        return loop.run_until_complete(coro)
    else:
        # Run normally with asyncio.run
        return asyncio.run(coro)

# Example usage (the URL in fetch_embedding is a placeholder; point it at a real service)
texts = ["Hello, world!", "This is an example sentence.", "Batch processing speeds up embedding."]
embeddings = run_async_code(generate_embeddings(texts))

# Print embeddings
print(embeddings)
