#### Proof of Concept
    -> Scrape a GitHub repo's files by URL
    -> Process the content and call the OpenAI API to summarise it

#### Step 1: Installing the required libraries using requirements.txt
    -> pip install -r ".\path\to\requirements.txt"

#### Step 2: Deciding scraping method
    -> We have two options here: 1) Raw scraping (fetching the file from raw.githubusercontent.com) 2) Using the GitHub API
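
Method 2 is not implemented in this notebook. As a minimal sketch, assuming GitHub's REST endpoint `GET /repos/{owner}/{repo}/readme` and its raw media type, fetching the README through the API could look like the code below. The helper names (`readme_api_url`, `build_headers`, `fetch_readme_via_api`) are hypothetical. The sketch uses the standard library's `urllib` so it runs without extra installs; `requests.get(url, headers=headers)` would work equally well.

```python
import urllib.request
from typing import Dict, Optional

GITHUB_API = "https://api.github.com"


def readme_api_url(owner: str, repo: str) -> str:
    """Build the REST endpoint that returns a repository's README."""
    return f"{GITHUB_API}/repos/{owner}/{repo}/readme"


def build_headers(token: Optional[str] = None) -> Dict[str, str]:
    """Ask for raw file contents; attach a token only if one is given."""
    headers = {"Accept": "application/vnd.github.raw+json"}
    if token:  # assumption: a personal access token, used to raise rate limits
        headers["Authorization"] = f"Bearer {token}"
    return headers


def fetch_readme_via_api(owner: str, repo: str, token: Optional[str] = None) -> str:
    """Fetch the README as Markdown text (performs a network request)."""
    request = urllib.request.Request(
        readme_api_url(owner, repo), headers=build_headers(token)
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

Compared with raw scraping, this route does not need the branch name or the README's exact filename, at the cost of API rate limits for unauthenticated calls.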

In [71]:
# Method 1: raw scraping
# First, send an HTTP GET request to the URL we want to fetch

import requests
from IPython.display import display, Markdown

url = "https://raw.githubusercontent.com/Hamzamazhar1999/2D-TLM/refs/heads/main/README.md"

response = requests.get(url, timeout=10)

if response.status_code == 200:
    readme_data = response.text
    display(Markdown(readme_data))
    # Issue 1: links and images that point to relative (repo-local) paths will
    # not render, because we fetch only the raw Markdown. Relative link targets
    # have to be resolved against the repo URL (or fetched as well) to restore
    # the missing images and links.
else:
    print(f"Error fetching URL (HTTP {response.status_code})")
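
One way to address Issue 1 from the comments above is to rewrite relative link and image targets into absolute URLs before rendering. The following is a minimal sketch, not the notebook's method: `absolutize_links` is a hypothetical helper, and it assumes that resolving targets against the raw file's URL is what is wanted. That suits images; page links might be better pointed at the repository's web URL instead.

```python
import re
from urllib.parse import urljoin

# Matches the opening of a Markdown link or image, "[text](" or "![alt](",
# followed by the target up to the first space or closing parenthesis.
LINK_PATTERN = re.compile(r"(!?\[[^\]]*\]\()([^)\s]+)")


def absolutize_links(markdown_text: str, base_url: str) -> str:
    """Resolve relative link/image targets against base_url."""

    def _rewrite(match: "re.Match") -> str:
        prefix, target = match.group(1), match.group(2)
        if target.startswith("#"):  # in-page anchor: leave untouched
            return match.group(0)
        # urljoin returns absolute targets unchanged
        return prefix + urljoin(base_url, target)

    return LINK_PATTERN.sub(_rewrite, markdown_text)
```

In the notebook this would be applied as `display(Markdown(absolutize_links(readme_data, url)))`.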

# 2D-TLM
Porting a 2D Transmission-Line Matrix Algorithm to CUDA

TLM or Transmission Line matrix method is a numerical technique that employs the time domain
to provide an approximation of the electromagnetic wave propagation. This approach uses a
cartesian matrix of nodes to depict the two-dimensional space where wave propagation and
scattering take place. It revolves around the discretization of the propagation of electromagnetic
waves in both time and space. This is an iterative procedure with the two steps of scattering and
connecting as its key components. TLM offers a highly instructive technique to use this algorithm
in computer simulations and wave propagation modeling.

The 2D CPU code provided as a reference for port to GPU has portions that could be very well
parallelized and optimized for runtime speedups and optimized application. The first thing that is
noticeable is the definition of compute extensive and repeating applications like the sqrt() function.

The most important parts that can be parallelized are the scatter and connect functions. In a
Transmission Line Matrix (TLM) simulation, the scatter function is used to update the grid of
nodes at each time step. Based on the neighbors’ values and the TLM equation's coefficients, it
calculates the new values for the nodes.

The grid has the dimensions 'Nx x Ny'. Each node in the grid has four different port voltages: 'V1',
'V2', 'V3', and 'V4'. The inner loop computes the updated values for the port voltages at each node
while the outer loop iterates across the rows and columns of the grid. The port voltages
and impedance 'Z' are used in calculation of the current 'I' flowing through the node. The current
and impedances are then used to update the port voltages.
The next part of the code that can be parallelized is the connect function. The structure of the
simulated network is set using this function. It describes the connections between the nodes of the
grids. The coefficients of the TLM equation are set up using the connect functions and these
ultimately lead to the update of the node values.

# Changes made in the GPU kernel:

Initially, the connect function is used to exchange the V2 and V4 port voltage values between the
grid nodes. Then the exchange of the voltages at the V1 and V3 port happens. These transactions
establish the connection of the grid nodes.

The nodes along the grid's edges are subject to boundary conditions according to the boundary
function. It multiplies the V3 and V1 port voltages for the nodes at the top and bottom borders of
the grid by the corresponding boundary reflection coefficients, rYmax and rYmin. It multiplies the
V4 and V2 port voltages for the nodes at the left and right edges of the grid by the corresponding
boundary reflection coefficients, rXmax and rXmin.

Other than these two functions, the source and the output probing functions could also be done
using CUDA kernels to ensure a seamless data allocation in device without having to copy and
allocate new memory locations after the updates.

Furthermore, there are some inconsistencies in the code that could be dealt with by using simple
coding techniques to properly optimize memory allocation and memory access. These include
dynamic allocation of Ein[] and Eout[] arrays to allow easy memory copying from host to device
for further processing and evaluating the output

# Improvements from GPU Use:
![image](https://github.com/Hamzamazhar1999/2D-TLM/assets/129704102/75fcacc2-c3d9-41b9-aeb6-3f412f903fd9)




#### Step 3: Deciding summarization method
    -> If raw scraping was used and the MD file was fetched, import transformers for summarization
    -> Or use other model APIs for contextual summarization

In [72]:
# from transformers import pipeline

# The current summarization pipeline from transformers has a few issues here:
# - the model does not perform well on inputs over 1024 tokens, and long
#   md files exceed that
# - the md files carry heavy formatting, which confuses the model
# So instead of the summarization pipeline, I want a model that summarizes
# more reliably and accepts a higher number of input tokens.
# summarizer = pipeline("summarization")

# Note: readme_data[:100] keeps only the first 100 characters (not tokens)
# summary = summarizer(readme_data[:100], max_length=40)
# print(summary[0])
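
If the transformers pipeline were kept, the two issues listed in the comments (heavy formatting and the input length limit) could be reduced by cleaning the Markdown and splitting it into chunks. This is a rough sketch under stated assumptions: `strip_markdown` and `chunk_words` are hypothetical helpers, the regexes cover only common Markdown constructs, and a word count is used as a crude stand-in for the model's token count.

```python
import re
from typing import List


def strip_markdown(text: str) -> str:
    """Remove common Markdown formatting, keeping the readable text."""
    text = re.sub(r"!\[[^\]]*\]\([^)]*\)", "", text)            # drop images
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)        # keep link text only
    text = re.sub(r"^#{1,6}\s*", "", text, flags=re.MULTILINE)  # heading markers
    text = re.sub(r"[*`]+", "", text)                           # bold/italic/code marks
    return re.sub(r"\s+", " ", text).strip()                    # collapse whitespace


def chunk_words(text: str, max_words: int = 400) -> List[str]:
    """Split text into pieces of at most max_words words each."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]
```

Each chunk could then be summarized separately and the partial summaries joined. The default of 400 words is an assumption chosen to stay well under 1024 tokens, not a measured value.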



In [73]:
from openai import OpenAI
from apikey import gpt_api

def prompt_template(readme_data):
    return f"""
You are a professional technical writer and summarization expert specializing in transforming complex documentation into concise, clear, and well-structured summaries. Your goal is to analyze the provided README file from a repository and generate a professional summary.

### Guidelines:
1. **Relevance**:  
   - Focus on the **core purpose** of the repository.  
   - Highlight key **features**, **functionalities**, and **intended use cases**.  
   - Identify the **target audience** or **users** of the repository.

2. **Clarity**:  
   - Use **simple and precise language** to explain the repository’s purpose and usage.  
   - Avoid overly technical jargon unless necessary for understanding.  

3. **Organization**:  
   - Structure the summary with headings if appropriate (e.g., "Overview", "Key Features", "Usage").  
   - Ensure the output is **well-formatted** and easy to read.  

4. **Formatting**:  
   - Output the summary in **Markdown format**.  

5. **Important**:
   - Do not invent details and stay to the point. If the README.md file is short, keep the summary short.
---

### Input:
- **README file (Markdown format)**:  
{readme_data}

---

### Output:  
1. **Summary**:  
   - A **concise and well-structured summary** of the repository.  
   - Highlights the repository’s **purpose**, **features**, and **usage**.  
   - Organized with headings for clarity and readability.

"""


In [74]:
prompt = prompt_template(readme_data)
# print(prompt)

In [75]:

# Set up the API client
client = OpenAI(api_key=gpt_api)

# Make the API call
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are an expert software engineer who summarizes repository files."},
        {"role": "user", "content": prompt},
    ],
    temperature=0.7,
)

# Extract the response text
response_string = response.choices[0].message.content
# print(response_string)
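
The call above can be wrapped in a small function so it can be reused for other files. The sketch below is one possible arrangement, not the notebook's own code: `get_api_key` and `summarize` are hypothetical names, the default system message is an assumption, and the key is read from the `OPENAI_API_KEY` environment variable instead of a local `apikey` module. `summarize` only relies on the `client.chat.completions.create(...)` call already used in this notebook.

```python
import os


def get_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the API key from the environment instead of a source file."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def summarize(
    client,
    prompt: str,
    model: str = "gpt-4o-mini",
    temperature: float = 0.7,
    system_message: str = "You summarize repository README files.",  # assumed wording
) -> str:
    """Send the prompt to the chat completions API and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": prompt},
        ],
        temperature=temperature,
    )
    content = response.choices[0].message.content
    if not content:  # guard against an empty or missing reply
        raise ValueError("The model returned an empty response")
    return content
```

Usage in the notebook would be `response_string = summarize(OpenAI(api_key=get_api_key()), prompt)`.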

In [76]:
display(Markdown(response_string))

# 2D-TLM Repository Summary

## Overview
The **2D-TLM** repository focuses on porting a **2D Transmission-Line Matrix (TLM)** algorithm to **CUDA**. TLM is a numerical method utilized for approximating electromagnetic wave propagation in two-dimensional spaces through a Cartesian matrix of nodes. The algorithm's iterative process involves scattering and connecting nodes to simulate wave behavior effectively.

## Key Features
- **TLM Methodology**: Implements a time-domain approach for modeling electromagnetic wave propagation.
- **CUDA Optimization**: Transforms existing CPU code to leverage GPU parallelization, aiming for enhanced runtime performance.
- **Core Functions**:
  - **Scatter Function**: Updates node values based on neighboring nodes and TLM coefficients.
  - **Connect Function**: Establishes connections between grid nodes and sets up TLM coefficients for node value updates.
- **Boundary Conditions Handling**: Applies specific reflection coefficients at grid edges to simulate realistic physical conditions.
- **Memory Optimization**: Introduces dynamic memory allocation strategies to improve data handling and processing efficiency on GPUs.

## Intended Use Cases
This repository is designed for researchers and developers interested in:
- Simulating electromagnetic wave behaviors using TLM methods.
- Enhancing computational efficiency through GPU programming.
- Exploring numerical modeling techniques in electromagnetic theory.

By providing both the CUDA-optimized code and a reference CPU implementation, users can gain insights into the performance benefits of GPU computing with TLM simulations.

In [77]:
# TODO: call the Notion API to convert the Markdown summary into a Notion document
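
As a starting point for this step, the Markdown summary has to be turned into Notion block objects before it can be sent. The sketch below is an assumption-heavy illustration: `markdown_to_notion_blocks` is a hypothetical helper, the block shapes follow the publicly documented block object format as I understand it (`heading_1` to `heading_3`, `bulleted_list_item`, `paragraph`, each carrying a `rich_text` list) and should be checked against the Notion API reference, and inline formatting such as `**bold**` is left as literal text. No network call is made here.

```python
from typing import Dict, List

MAX_TEXT_LENGTH = 2000  # assumed per-item limit for rich text content


def _rich_text(content: str) -> List[Dict]:
    """Wrap plain text in a single rich text item."""
    return [{"type": "text", "text": {"content": content[:MAX_TEXT_LENGTH]}}]


def markdown_to_notion_blocks(markdown_text: str) -> List[Dict]:
    """Convert headings, bullets, and paragraphs line by line; skip blank lines."""
    blocks = []
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("### "):
            block_type, content = "heading_3", stripped[4:]
        elif stripped.startswith("## "):
            block_type, content = "heading_2", stripped[3:]
        elif stripped.startswith("# "):
            block_type, content = "heading_1", stripped[2:]
        elif stripped.startswith(("- ", "* ")):
            block_type, content = "bulleted_list_item", stripped[2:]
        else:
            block_type, content = "paragraph", stripped
        blocks.append({
            "object": "block",
            "type": block_type,
            block_type: {"rich_text": _rich_text(content)},
        })
    return blocks
```

The resulting list would be passed as the `children` of a page-creation request, for example `markdown_to_notion_blocks(response_string)`. Nested bullets are flattened in this sketch because leading indentation is stripped.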
