# Leveraging LLMs for Code Analysis

In [1]:
%load_ext jupyter_ai

In [3]:
%%ai ollama:deepseek-coder-v2

How do I create an array in Python?

 To create an array in Python, you can use the `array` module or simply a list. Here’s how to create and print both an array and a list:

```python
import array as arr

# Creating an array of integers
int_array = arr.array('i', [1, 2, 3, 4, 5])  # 'i' denotes integer type

# Printing the array
print(int_array)

# Alternatively, using a list
python_list = [1, 2, 3, 4, 5]

# Printing the list
print(python_list)
```

Output:
```
array('i', [1, 2, 3, 4, 5])
[1, 2, 3, 4, 5]
```

In [4]:
import array as arr


# Creating an array of integers
int_array = arr.array('i', [1, 2, 3, 4, 5])  # 'i' denotes integer type

# Printing the array
print(int_array)

# Alternatively, using a list
python_list = [1, 2, 3, 4, 5]

# Printing the list
print(python_list)

array('i', [1, 2, 3, 4, 5])
[1, 2, 3, 4, 5]


In [6]:
%%ai ollama:deepseek-coder-v2

You are an expert Python programmer. Explain the following Python code, delimited by triple backticks, to a more junior developer.

```
import array as arr


# Creating an array of integers
int_array = arr.array('i', [1, 2, 3, 4, 5])  # 'i' denotes integer type

# Printing the array
print(int_array)

# Alternatively, using a list
python_list = [1, 2, 3, 4, 5]

# Printing the list
print(python_list)
```

 Sure! Let's break down this Python code for a more junior developer:

### Code Explanation

1. **Importing the Array Module**
   ```python
   import array as arr
   ```
   - Here, we are importing the `array` module and giving it an alias `arr`. This allows us to use functions or classes from this module more conveniently later in our code.

2. **Creating an Array of Integers**
   ```python
   int_array = arr.array('i', [1, 2, 3, 4, 5])  # 'i' denotes integer type
   ```
   - We are creating an array specifically for integers (`'i'` stands for integer).
   - The `arr.array('i', [1, 2, 3, 4, 5])` part means we are initializing an array with the name `int_array`, and it contains the elements `[1, 2, 3, 4, 5]`.

3. **Printing the Array**
   ```python
   print(int_array)
   ```
   - This line prints the contents of the array to the console. The output will be: `array('i', [1, 2, 3, 4, 5])`.

4. **Creating a List Using Python's Built-in Syntax**
   ```python
   python_list = [1, 2, 3, 4, 5]
   ```
   - Here, we are creating a list named `python_list` that contains the same integers as our array. The syntax for lists in Python is square brackets `[]`.

5. **Printing the List**
   ```python
   print(python_list)
   ```
   - This line prints the contents of the list to the console. The output will be: `[1, 2, 3, 4, 5]`.

### Summary
- We used the `array` module to create an array of integers, and Python's built-in list syntax to create a list. Both are collections that can hold multiple items.
- Arrays in Python's `array` module are more specific about the type of elements they contain (e.g., only integers), while lists are more flexible and can contain any data types.
- We printed both the array and the list to verify their contents.

### Output
When you run this code, it will produce the following output:
```
array('i', [1, 2, 3, 4, 5])
[1, 2, 3, 4, 5]
```

This should give a clear understanding of how arrays and lists are created and used in Python.
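
The type restriction described in the summary can be checked directly. The following is a minimal sketch and is not part of the model's response; the helper name `try_append` is hypothetical and exists only for this illustration.

```python
import array as arr

def try_append(container, value):
    """Append value and report whether the container accepted it (hypothetical helper)."""
    try:
        container.append(value)
        return True
    except TypeError:
        return False

int_array = arr.array('i', [1, 2, 3])
python_list = [1, 2, 3]

# The typed array rejects a value that is not an integer...
array_accepted = try_append(int_array, "four")
# ...while the list accepts any object.
list_accepted = try_append(python_list, "four")

print(array_accepted)  # False
print(list_accepted)   # True
```

The array is left unchanged after the failed append, so it still holds only integers.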

In [7]:
%%ai ollama:deepseek-coder-v2

What happens if I try to access data in an empty Python array?

 ```python
# Create an empty list
my_array = []

# Attempt to access the first element of the array
try:
    first_element = my_array[0]
except IndexError as e:
    print(f"An error occurred: {e}")
```

Output: 
```
An error occurred: list index out of range
```
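
Note that the model answered with a list rather than an `array.array`. As a minimal sketch, assuming the question was about the `array` module, the same kind of exception appears there too; the variable names below are illustrative only.

```python
import array as arr

empty_array = arr.array('i')  # a typed array with no elements

try:
    first_element = empty_array[0]
except IndexError as e:
    # Indexing an empty array raises IndexError, just as with a list.
    error_type = type(e).__name__
    print(f"{error_type} raised")

# Checking the length first avoids the exception entirely.
first_or_none = empty_array[0] if len(empty_array) > 0 else None
```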

## Linked Lists

### Exercise

Ask your LLM to...

1. Write some code to implement a linked list in Python.
2. Explain the downsides and overheads associated with linked lists compared to other data structures.


In [2]:
%%ai ollama:deepseek-coder-v2

Write some code to implement a linked list in Python.


 Here's an example of how you can implement a singly linked list in Python, along with a simple function to print the elements of the linked list:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    def append(self, data):
        new_node = Node(data)
        if not self.head:
            self.head = new_node
        else:
            current = self.head
            while current.next:
                current = current.next
            current.next = new_node

    def prepend(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

    def delete(self, key):
        current = self.head
        previous = None
        while current and current.data != key:
            previous = current
            current = current.next
        if not current:
            return
        if not previous:
            self.head = current.next
        else:
            previous.next = current.next

    def search(self, key):
        current = self.head
        while current and current.data != key:
            current = current.next
        return bool(current)

    def print_list(self):
        elements = []
        current = self.head
        while current:
            elements.append(current.data)
            current = current.next
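        # Convert each element to a string first; str.join raises TypeError on integers.
        print(" -> ".join(str(item) for item in elements))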

# Example usage:
llist = LinkedList()
llist.append(1)
llist.append(2)
llist.prepend(0)
llist.print_list()  # Output: 0 -> 1 -> 2
llist.delete(1)
llist.print_list()  # Output: 0 -> 2
print(llist.search(2))  # Output: True
print(llist.search(1))  # Output: False
```

This code defines a `Node` class for the individual elements in the linked list and a `LinkedList` class to manage them. The `append`, `prepend`, and `delete` methods allow you to add, insert at the beginning, and remove nodes from the list, respectively. The `search` method checks if an element is present in the list, and the `print_list` method prints out the elements of the list.
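
One possible extension, not part of the model's response, is to make the list iterable so it works with `for` loops, `list()`, and the `in` operator. The sketch below assumes a reduced class with only `prepend`; the name `IterableLinkedList` is hypothetical.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class IterableLinkedList:
    """Hypothetical variant of the LinkedList above, reduced to what this example needs."""

    def __init__(self):
        self.head = None

    def prepend(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

    def __iter__(self):
        # Yield each node's data in order, starting from the head.
        current = self.head
        while current:
            yield current.data
            current = current.next

    def __str__(self):
        # Convert every element to a string so non-string data can be joined.
        return " -> ".join(str(item) for item in self)

llist = IterableLinkedList()
for value in (2, 1, 0):
    llist.prepend(value)

print(llist)        # 0 -> 1 -> 2
print(list(llist))  # [0, 1, 2]
print(1 in llist)   # True
```

Because `__iter__` is a generator, membership tests with `in` fall back to iterating over the nodes, so a separate `search` method becomes optional in this variant.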

In [3]:
%%ai ollama:deepseek-coder-v2

Explain the downsides and overheads associated with linked lists compared to other data structures.


 ### Downsides and Overheads of Linked Lists Compared to Other Data Structures

#### 1. **Memory Allocation**
   - **Downside**: Unlike arrays, where elements are stored contiguously in memory, linked list nodes may be scattered throughout the heap memory due to dynamic allocation during runtime. This can lead to increased cache misses and less efficient use of CPU caches.
   - **Overhead**: The overhead comes from the fact that each node must store not only its data but also a pointer to the next (or previous) node, which adds extra space for pointers.

#### 2. **Access Time**
   - **Downside**: Accessing an element in a linked list requires traversing the list from either the head or tail, depending on whether it's a singly or doubly linked list. This traversal can be inefficient if you need to access elements that are not near these ends.
   - **Overhead**: The time complexity for accessing any random element is O(n) in the worst case, which is less efficient than the constant-time access provided by arrays (O(1)) or more sophisticated indexing mechanisms found in some tree structures.

#### 3. **Insertion and Deletion**
   - **Downside**: Inserting or deleting a node requires updating pointers to keep the list intact, and in a singly linked list you must first traverse to the node before the position you want to change, unless that position is the head.
   - **Overhead**: The pointer update itself is O(1), but locating the position is O(n) in the worst case. Appending to a singly linked list without a tail pointer is also O(n), as in the `append` method above, whereas appending to a dynamically resized array is amortized O(1).

#### 4. **No Random Access**
   - **Downside**: Linked lists do not support random access directly because each node is only connected sequentially. You must traverse from the head or tail to reach any specific element.
   - **Overhead**: This limitation can be a significant drawback, especially in scenarios where frequent random accesses are required and efficiency of O(1) access is crucial.

#### 5. **Implementation Complexity**
   - **Downside**: Implementing linked lists requires careful management of pointers to ensure that the list does not become fragmented or lead to memory leaks due to unreferenced nodes.
   - **Overhead**: More complex implementation logic and potential for errors, especially as the size and complexity of applications increase.

#### 6. **Iterators and Iteration**
   - **Downside**: Standard C-style pointers used in linked lists are not inherently iterable like iterators in other data structures (e.g., vectors or arrays). You need to implement custom iterators if you want to traverse the list in a more user-friendly way, which adds complexity and potential for errors.
   - **Overhead**: Custom iteration logic can be error-prone and may not offer many of the conveniences provided by standard library iterators (e.g., range-based loops in C++).

### Conclusion
While linked lists provide dynamic size, ease of insertion/deletion near the head or tail, and are inherently recursive due to their self-referential nature, they also come with significant downsides including inefficient random access, higher memory overhead for pointers, and more complex implementation. These factors should be carefully weighed against the specific requirements and constraints of a given application when choosing between linked lists and other data structures.
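
The access-time point can be made concrete by counting how many links must be followed to reach an index. This is a minimal sketch under stated assumptions, not part of the model's response; `build_chain` and `get_with_hops` are hypothetical helpers written for this illustration.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def build_chain(values):
    """Link the values into a singly linked chain and return the head node."""
    head = None
    for value in reversed(values):
        node = Node(value)
        node.next = head
        head = node
    return head

def get_with_hops(head, index):
    """Return (data, hops), where hops is the number of links followed."""
    if index < 0:
        raise IndexError("negative indexes are not supported in this sketch")
    current = head
    hops = 0
    while current is not None and hops < index:
        current = current.next
        hops += 1
    if current is None:
        raise IndexError("linked list index out of range")
    return current.data, hops

values = list(range(1000))
head = build_chain(values)

print(get_with_hops(head, 0))    # (0, 0)
print(get_with_hops(head, 999))  # (999, 999)
print(values[999])               # 999, reached by direct indexing
```

The hop count grows linearly with the index, which is the O(n) access cost described above; the Python list reaches the same element in a single indexing step.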

### Exercise

Choose one of the points that the LLM raised in response to your last prompt, and ask some follow-up questions to dig deeper. Continue the conversation to get more information on any points that interest you or that relate to concepts you may not be familiar with.

The goal is to explore how an ongoing conversation can reveal deeper insights about the problem you are working on and options for solving it. One useful skill is asking an LLM to critique your understanding of an issue. This can help you improve your knowledge and skills.



### Exercise

Take a look at `../src/linked_list.py` (this code was generated using GPT-4o). Take some time to implement code that enables you to remove a node from the linked list. You can do this by hand, or ask an LLM to do it for you. Then go back and ask the LLM to profile the new code from the point of view of an expert in mitigating denial-of-service attacks, and to help you update the code to make sure it is safe from attack and performs well.
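
One possible shape for the removal method is sketched below as a starting point for the exercise. It assumes a `LinkedList` with `head`, `size`, and `lock` attributes like the one in `../src/linked_list.py`. The method names `remove` and `to_list`, and the boolean return value, are assumptions for illustration, and the `append` here is a simplified stand-in that omits the validation in the original file.

```python
import threading

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None
        self.size = 0
        self.lock = threading.Lock()

    def append(self, data):
        # Simplified stand-in: no size limit or payload validation.
        with self.lock:
            new_node = Node(data)
            if self.head is None:
                self.head = new_node
            else:
                last = self.head
                while last.next:
                    last = last.next
                last.next = new_node
            self.size += 1

    def remove(self, key):
        """Unlink the first node whose data equals key; return True if one was found."""
        with self.lock:
            previous = None
            current = self.head
            while current is not None and current.data != key:
                previous = current
                current = current.next
            if current is None:
                return False  # key not present; list unchanged
            if previous is None:
                self.head = current.next  # removing the head node
            else:
                previous.next = current.next
            self.size -= 1
            return True

    def to_list(self):
        """Return the stored values in order, for inspection."""
        with self.lock:
            values = []
            current = self.head
            while current is not None:
                values.append(current.data)
                current = current.next
            return values

llist = LinkedList()
for item in ("a", "b", "c"):
    llist.append(item)

removed = llist.remove("b")
missing = llist.remove("z")
print(removed, missing)  # True False
print(llist.to_list())   # ['a', 'c']
print(llist.size)        # 2
```

Note that `remove` holds the lock for the whole O(n) traversal. That trade-off is worth raising when you ask the LLM to review the code from a denial-of-service perspective.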

In [4]:
%%bash

cat ../src/linked_list.py

import threading

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self, max_size=None):
        self.head = None
        self.size = 0
        self.max_size = max_size  # Maximum size limit for the linked list
        self.lock = threading.Lock()  # Lock for concurrency control

    def append(self, data):
        # Validate input data
        if len(data) > 1000:  # Example: Limit data size to prevent excessively large payloads
            raise ValueError("Data size exceeds maximum limit")

        with self.lock:
            # Rate limiting mechanism can be implemented here
            if self.max_size is not None and self.size >= self.max_size:
                raise ValueError("Linked list is full")

            new_node = Node(data)
            if self.head is None:
                self.head = new_node
            else:
                last = self.head
                while last.next:
                 

### Exercise

Consider why someone might want to implement a doubly-linked list rather than a singly-linked one. Take a few moments to think of some reasons. Then prompt an LLM with the same question, and compare the model’s answers to the items that you identified.

In [None]:
Instruction 1:

Take a look at the binary search tree code included in the downloads for this video. What issues would you have to consider before deploying code that uses a BST like this into production? Make a note of any issues you come up with. 

Then pass the code to an LLM and ask it to assume the role of a site reliability engineer, analyze the code, and suggest changes to make it perform well and securely in production. See what issues it comes up with, and compare the model’s answer to your own list. Take a close look at any code the LLM writes and ask follow-up questions to make sure you understand what it generated.



In [None]:
Instruction 2:

Take a look at the AVL.py code in the downloads for this video. Try running the code in the Coursera lab environment, in Google Colab, or on your own machine. Try to break the code: there are some bugs in the implementation. Hint: think of a role to assign the LLM to help you identify edge cases. Then work with the LLM to fix the bugs you find and enhance the code.

In [None]:
Instruction 1:

Examine the code in the graph.py file in the downloads for this video. This code implements a directed graph. Take a few moments to try the code out, see how it works, and identify any issues that might arise if this code were used in production.

What improvements or changes would you need to implement to make the code ready for production settings? Is there any missing functionality? Or any security vulnerabilities to consider?

After thinking through these issues yourself, pass the code to an LLM and ask it to critique the code and explore these issues for you. Read the model’s output, ask follow-up questions if you need to, and then try out any code it generates to check that it works. When you are done, come back to see how that conversation progressed for Laurence.

Instruction 2:

Download the code in graph2.py from the video downloads, which Laurence generated after his back and forth with ChatGPT. 

Try the code out and see if you can break it, or ask an LLM to help you identify edge cases and test those out. Then work with the LLM to fix any issues that you identify.