# Code Buddy: Your Technical Code Assistant
Made by: Guru Deep Singh

Code Buddy is an LLM-powered assistant that answers questions about your code: it checks whether a snippet is correct, explains it in beginner-friendly terms, and suggests runtime optimizations.

In [21]:
#imports
import os
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display, update_display

In [22]:
# Class for creating an LLM-powered assistant

class CodeBuddy:
    ALLOWED_BACKENDS = {"ollama", "openai"}
    def __init__(self, backend="ollama", modelName= "llama3.2"):
        if backend not in self.ALLOWED_BACKENDS:
            raise ValueError(f"Invalid backend '{backend}'. Allowed options are: {', '.join(self.ALLOWED_BACKENDS)}")
           
        self.backend = backend
        self.modelName = modelName
        print(f"Code Buddy using {self.backend} backend and {modelName} model")
        self.__initialize()

    def __initialize(self):
        if self.backend == "openai":
            # Load OPENAI_API_KEY from a .env file; OpenAI() reads it from the environment
            load_dotenv(override=True)
            apiKey = os.getenv('OPENAI_API_KEY')
            if not apiKey:
                # Fail early: without a key no client is created and later calls would break
                raise RuntimeError("No API key was found; set OPENAI_API_KEY")
            self.__requestHandler = OpenAI()

        elif self.backend == "ollama":
            # Ollama exposes an OpenAI-compatible endpoint locally; the api_key is a placeholder
            self.__requestHandler = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
            
    def __nonStreamCall(self,userPrompt, systemPrompt):
        response = self.__requestHandler.chat.completions.create(
            model = self.modelName,
            messages = [
                {"role": "system", "content": systemPrompt},
                {"role": "user", "content": userPrompt}
                ])
        return response.choices[0].message.content

    def __streamCall(self,userPrompt, systemPrompt):
        stream = self.__requestHandler.chat.completions.create(
                model = self.modelName,
                messages = [
                    {"role": "system", "content": systemPrompt},
                    {"role": "user", "content": userPrompt}
                ],
                stream=True
            )
            
        for chunk in stream:
            # Skip chunks without choices or text; the final chunk carries content=None
            if chunk.choices and chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content



    def run(self, userPrompt, stream = False,
            systemPrompt = ("You are a helpful and informed coding agent. "
                            "You are given a piece of code. You have to check whether the code is correct or incorrect. "
                            "You need to explain the code in a beginner-friendly way. "
                            "You are also allowed to suggest improvements to the code for runtime optimization. "
                            "Give your answer in Markdown.")):

        # Reject empty prompts instead of silently returning None
        if not systemPrompt.strip() or not userPrompt.strip():
            raise ValueError("userPrompt and systemPrompt must not be empty")

        # Streaming returns a generator of text chunks; otherwise the full answer as a string
        if stream:
            return self.__streamCall(userPrompt, systemPrompt)
        return self.__nonStreamCall(userPrompt, systemPrompt)

In [23]:
# Helper functions for streaming
def stream_markdown(generator):
    buffer = ""
    display_handle = display(Markdown(""), display_id=True)

    for part in generator:
        buffer += str(part)
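        # Remove only a wrapping ```markdown fence; code fences inside the answer are kept
        cleaned = buffer.strip()
        if cleaned.startswith("```markdown"):
            cleaned = cleaned[len("```markdown"):].removesuffix("```")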
        update_display(Markdown(cleaned), display_id=display_handle.display_id)

def stream(generator):
    for part in generator:
        print(part, end="", flush=True)
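
The helpers above only need an iterable of text chunks, so they can be tried without any backend. A minimal sketch, assuming a hypothetical stand-in generator `fake_stream` (not part of Code Buddy) that slices a string into small pieces the way a streamed response arrives:

```python
def fake_stream(text, size=8):
    """Hypothetical stand-in for CodeBuddy.run(..., stream=True): yields text in small pieces."""
    for start in range(0, len(text), size):
        yield text[start:start + size]

def stream(generator):
    # Same helper as in the cell above, repeated so the sketch runs on its own
    for part in generator:
        print(part, end="", flush=True)

sample = "## Hello\nStreaming works chunk by chunk."
chunks = list(fake_stream(sample))   # collect once to inspect the pieces
stream(fake_stream(sample))          # prints the text piece by piece
```

Joining the chunks gives back the original text, which is what `stream_markdown` relies on when it accumulates parts in `buffer`.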

In [24]:
question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""
# Assistant powered by OpenAI API 
myBuddy = CodeBuddy("openai", "gpt-4o-mini")


Code Buddy using openai backend and gpt-4o-mini model


In [25]:
output = myBuddy.run(userPrompt = question, stream=False)
display(Markdown(output))

## Code Explanation

This line of code is using a combination of Python constructs to yield elements from a generator. Let's break it down step-by-step:

1. **Set Comprehension**:
   - `{book.get("author") for book in books if book.get("author")}` is a set comprehension. It creates a set of unique authors from a collection called `books`.
   - `for book in books`: This part iterates over each item (`book`) in the iterable `books`.
   - `book.get("author")`: This line retrieves the value associated with the key `"author"` from the dictionary `book`. If the key does not exist, `get` will return `None` instead of raising an error.
   - `if book.get("author")`: This condition ensures that only books that have an author (i.e., the author's name is not `None` or an empty string) will be included in the resulting set.

2. **Yield From**:
   - The `yield from` statement allows the function to yield all values from the set created by the comprehension. This means the code is likely part of a generator function, and it will yield each author one at a time when called within that function.

### Summary
In summary, this code gets a unique list of authors from the list of books, filtering out any that don't have an author, and then yields each author individually as a generator.

## Code Behavior Example

If you have the following `books` list:

```python
books = [
    {"title": "Book 1", "author": "Author A"},
    {"title": "Book 2", "author": "Author B"},
    {"title": "Book 3", "author": None},
    {"title": "Book 4", "author": "Author A"},
]
```

Using the code provided:

```python
yield from {book.get("author") for book in books if book.get("author")}
```

This will yield:
- `Author A`
- `Author B`

## Suggestions for Improvement

1. **Use of `set` vs `list`**: 
   - If the order of authors is not important, using a `set` is efficient because it automatically handles duplicates. If you want to preserve the order in which authors first appear, you might consider using a list with a check for duplicates.

2. **Error Handling**: 
   - If `book` can sometimes not be a dictionary or certain keys are sometimes missing, consider adding additional error handling or checks for data integrity.

3. **Performance Consideration**:
   - If `books` is very large, and the set comprehension consumes a lot of memory, you might want to investigate using a generator expression instead to yield authors one at a time without creating a full set in memory. This can be done by directly using `yield` inside a loop rather than using `yield from`.

### Example with Generator:
```python
def get_authors(books):
    seen_authors = set()
    for book in books:
        author = book.get("author")
        if author and author not in seen_authors:
            seen_authors.add(author)
            yield author
```

This version yields authors one by one while keeping track of those already seen, which might save memory if the list of books is large.
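
The line from the question only runs inside a function body. A minimal runnable sketch, assuming two hypothetical wrapper functions (`unique_authors` and `unique_authors_ordered` are illustrative names, not part of the notebook), one using the original set and one keeping first-seen order through `dict.fromkeys`:

```python
def unique_authors(books):
    """Hypothetical wrapper: yields each distinct, non-empty author once (set order is arbitrary)."""
    yield from {book.get("author") for book in books if book.get("author")}

def unique_authors_ordered(books):
    """Same authors, in first-seen order; dict keys keep insertion order."""
    yield from dict.fromkeys(book.get("author") for book in books if book.get("author"))

books = [
    {"title": "Book 1", "author": "Author B"},
    {"title": "Book 2", "author": "Author A"},
    {"title": "Book 3", "author": None},
    {"title": "Book 4", "author": "Author B"},
]

print(sorted(unique_authors(books)))        # ['Author A', 'Author B']
print(list(unique_authors_ordered(books)))  # ['Author B', 'Author A']
```

The set version is sorted before printing because the iteration order of a set of strings is not guaranteed to be stable between runs.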

In [26]:
stream_markdown(myBuddy.run(userPrompt=question, stream=True))

### Code Explanation

The line of code provided is using Python's `yield` statement combined with a generator expression and a set comprehension. Let's break it down:

1. **Context**: 
   - This line of code is likely part of a generator function, which is a special type of function in Python that allows you to iterate over data without storing the entire result in memory at once.

2. **Set Comprehension**: 
   - `{book.get("author") for book in books if book.get("author")}`: 
     - This part of the code creates a set of unique authors from a collection of books.
     - `books` is expected to be a list (or any iterable) of dictionaries, where each dictionary represents a book with various details, including an "author".
     - The expression `book.get("author")` attempts to retrieve the "author" key from each `book` dictionary. If a book doesn’t have an "author", it will return `None`. 
     - The `if book.get("author")` condition ensures that only books with a valid author are included in the set, effectively filtering out any entries with a missing or empty author.

3. **`yield from`**:
   - The `yield from` statement is used to yield all values from another iterable (in this case, the set of authors). This means that this line will generate each author one by one.

### What the Code Does:

- The line of code retrieves all the unique authors from a given list of books, filtering out any entries where the author is not specified. It generates these unique authors lazily, meaning they are produced one by one and do not need to be stored in a list or set upfront, which can save memory in large datasets.

### Suggestions for Improvement:

1. **Performance**:
   - If `books` is a very large collection, consider checking if the data structure can be optimized for faster lookups or accesses (e.g., using a different data structure suitable for the use case).

2. **Handling Duplicates**:
   - Using a set is a good approach to ensure authors are unique. However, if the order of authors matters, consider using an `OrderedDict` or maintaining a list and converting it to a set if you need exclusiveness after collecting.

3. **Error Handling**:
   - Depending on the structure of the `books`, you might want to add error handling for cases where the data format might be inconsistent, leading to potential key errors.

### Final Note:

Overall, this line is correct in terms of syntax and logic given the expected structure of `books`. It effectively combines filtering and uniqueness-generating features into a single, concise statement.
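
How lazy the line really is can be checked by hand. A small sketch, assuming a hypothetical wrapper `unique_authors` and a hypothetical helper `tracked` that records each book as it is read: the set comprehension runs to completion before `yield from` hands out the first author, so only the hand-out is lazy, not the collection step.

```python
consumed = []

def tracked(books):
    """Hypothetical helper: yields books one by one, recording each title as it is read."""
    for book in books:
        consumed.append(book["title"])
        yield book

def unique_authors(books):
    """Hypothetical wrapper around the line from the question."""
    yield from {book.get("author") for book in books if book.get("author")}

books = [
    {"title": "Book 1", "author": "Author A"},
    {"title": "Book 2", "author": "Author B"},
    {"title": "Book 3", "author": None},
]

gen = unique_authors(tracked(books))
before_first = list(consumed)   # nothing has been read yet
first = next(gen)               # builds the whole set, then yields one author
after_first = list(consumed)    # every book has already been read

print(before_first)  # []
print(after_first)   # ['Book 1', 'Book 2', 'Book 3']
```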

# Ollama based Code Buddy

In [29]:
# Assistant powered by Ollama 
myBuddyOllama = CodeBuddy("ollama", "llama3.2")

Code Buddy using ollama backend and llama3.2 model


In [30]:
# Pulling model
!ollama pull llama3.2

pulling manifest
pulling dde5aa3fc5ff: 100% ▕██████████████████▏ 2.0 GB
pulling 966de95ca8a6: 100% ▕██████████████████▏ 1.4 KB
pulling fcc5a6bec9da: 100% ▕██████████████████▏ 7.7 KB
pulling a70ff7e570d9: 100% ▕██████████████████▏ 6.0 KB
pulling 56bb8bd477a5: 100% ▕██████████████████▏   96 B
pulling 34bb5ab01051: 100% ▕██████████████████▏  561 B
verifying sha256 digest
writing manifest
success


In [31]:
outputOllama = myBuddyOllama.run(userPrompt = question, stream=False)
display(Markdown(outputOllama))

**Code Explanation**

The given code uses a feature called generator expression to extract the authors of books from a collection of books.

Here's a breakdown of how it works:

* `yield from`: This keyword is used to delegate to another iterable (in this case, the generator expression) and yield its values directly.
* `{ ... }`: This is a dictionary comprehension that creates a new dictionary. However, in this context, it's not actually creating a dictionary, but rather a string of attribute accesses.
* `book.get("author")`: This line uses the `get` method on each book object to access its "author" attribute.
* `(...] for book in books if ...`: This is a generator expression that iterates over the `books` collection. It only yields values from the iteration if the condition `book.get("author") is not None`.

**What Does it Do?**

The code returns an iterator that produces authors of books where the author information exists.

In other words, it takes a list or other iterable of book objects and returns an iterator over individual authors, skipping any books that don't have "author" attributes set.

Here's a simple example:

```python
books = [
    {"title": "Book1", "author": "John"},
    {"title": "Book2"},
    {"title": "Book3", "author": "Alice"}
]

for author in yield from {book.get("author") for book in books if book.get("author")}:
    print(author)  # Output: John, Alice
```

**Improvement Suggestions**

There's not much room for improvement here, as the code is already quite readable and performs its intended task. However, for better performance, we can consider a few optimizations:

1. **Preprocessing**: Instead of calling `get` on each book object in the expression, you could precompute the authors by iterating over all books once.

    ```python
authors = [book["author"] for book in books if "author" in book]
```

    Then use this list directly with generator expressions:

    ```
for author in yield from {author for _ in authors}
```


2. **Lazy Iteration**: The original code doesn't use any lazy iterators, which means that the whole `books` collection is evaluated at once. If you're dealing with very large datasets and can afford to spend a bit more on computations upfront, consider using `itertools.chain()` or an actual iterator (like `.list()`):

```python
import itertools

# ... somewhere in your code before using yield from ...

author_list = list(book["author"] for book in books if "author" in book)
```

    Alternatively, always try to use existing iterators where possible:

```python
author_set = set(book["author"] for book in books if "author" in book)

# Later...

for _ in author_set:
    yield from {book.get("author")}
```
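
What the braces in the question create, and whether `yield from` can sit in a `for` header, can both be checked directly in Python. A minimal sketch using the sample `books` from the answer above (the names `authors`, `by_title`, and `snippet` are illustrative):

```python
books = [
    {"title": "Book1", "author": "John"},
    {"title": "Book2"},
    {"title": "Book3", "author": "Alice"},
]

# Braces with a single expression build a set; a dict comprehension needs key: value
authors = {book.get("author") for book in books if book.get("author")}
by_title = {book["title"]: book.get("author") for book in books}
print(type(authors).__name__)   # set
print(type(by_title).__name__)  # dict

# `yield from` is not accepted as the iterable of a for loop
snippet = "for author in yield from {1, 2}:\n    print(author)\n"
try:
    compile(snippet, "<sketch>", "exec")
    compiles = True
except SyntaxError:
    compiles = False
print(compiles)  # False
```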

In [32]:
stream_markdown(myBuddyOllama.run(userPrompt=question, stream=True))

**Code Explanation**

This Code is written in Python and utilizes a feature called async generators (with the `async` and `await` keywords). It's used to iterate over a list of objects (`books`) and yield the authors of each object.

Here's what's happening step-by-step:

- `{... for book in books if book.get("author")}`: This part is an **expression**, which is also called an iterable. In this case, it starts with an expression enclosed by curly brackets `{`, followed by a keyword `for`. The expression itself contains three elements:
  - `book`: The item that the loop will iterate over.
  - `books`: The list to be processed using `for` loop.
  - `if book.get("author")`: A condition to only consider items in the list if they have something called `"author"`, which is presumably stored within each book object.

- `${...}`: When used before a word, like here as `{...}`, it creates an iterable using **dictionary-like** access. 

So overall, this expression says "Take `book` from the list and include in our new iterable if it has been verified that its `"author"` key is present."

- `yield` keyword makes the entire code block act like a generator.

Now let's break down your statement.

`yield from {book.get("author") for book in books if book.get("author")}`:

`yield from` keyword does two things:
1. **it adds all items** (returns an iterator) to `your generator here`.
2. **then it gives up control of its execution**

In simpler words, we can say that the expression in the dictionary (`{book.get("author") for book in books if book.get("author")}`) generates a list, which contains only those items where `"author"` is present, to be used like an iterable inside the larger `yield from` statement.

Here's how you might imagine this process. Let's say `books` has multiple elements `(book A, book B)`. The expression in our original dictionary returns the author for each book (a new list). If the second book doesn't have `"author"`, it immediately won't be included.

python
# before: books=[{"title": "Title", "genre": ["Action", "Comedy"]}, {"title":"Title2","author":"Writer4"}]
# after:
books=[{"title": "Title", "genre": ["Action", "Comedy","Drama"]},{'title':' Title2','a': 'writertofollow'}]


`yield from {...}` will then return each book's author individually.

python
return [
    yield "Writer4",
    yield "unknown"
]

# the returned output:
[
    Writer4,
    unknown
]

Let me rephrase the `yield`:

`yield from {book.get("author") for book in books if book.get("author")} -> author_from_book_1, author_from_book_2`

### Example Code

python
import asyncio


class AuthorBook:
    def __init__(self, title, genre=None, author=''):
        self.title = title
        self.genre = genre
        self.author = author

    @property
    def available_authors(self):
        # return "Author1" if self.author,
        return f"{self.author}"


async def get_all_author_from_book_list(books_list):
    result_list = []

    async with asyncio.AsEnsure:
        async for book in books_list:             # this iterates over the books list of dict.
            author_name = await (  # using property, because dictionary cannot do this
                book.available_authors.as_value      # .get would also do this operation at runtime though, and then convert that into an
                                                                tuple, rather than a name,
                                )
            result_list.append(author_name)

    # since they each get their own separate generator
    return [author_name async for author_name in await asyncio.gather(*result_list)]  # because of this line.



# the books list of dict.
books_list = [
    AuthorBook('title1', genre=None, author='Writer1'),
    AuthorBook('Title2', genre=None, author='Writer4')
]

# getting results
async def main_function():
    result_list = await get_all_author_from_book_list(books_list)
    
    # the returned output: ['Writer1', 'Writer4']
    return result_list

if __name__ == '__main__':
    loop = asyncio.get_event_loop()  # it calls getEventLoop() instead in Python >=3.8
    running_result = await loop.run_until_complete(main_function())
    print(running_result)

### Advice for Improved Code:

To make a best performance of iteration through a list with `async/await`.

python
import asyncio

my_list=[]
for i in range(1000):
    my_list.append(i)

async def task():
        # to use the full capabilities
    return sum(my_list)
        
            
# async operation for loop can't be done here (the issue will be memory and performance) but by using a dictionary with list of coroutine objects:

async def async_coroutine(i) :
            ans = i**2 
            return ans
            
async def main_function():
    await asyncio.gather(async_task1,async_task2)
    
async def async_task1() :
        
        for i in range (10):
                
            yield from async_coroutine(5*i)

async def async_task3() :
        
        for i in  range(10):
                yield from async_coroutine(i)
        
        
if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    running_result = await loop.run_until_complete(main_function())
    print(running_result)


*   Always prefer to use the asynchronous operation, especially when dealing with time-consuming operations in your routine (I/O bound).
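
Whether the line from the question involves `async` at all can be verified with the standard library. A small check, assuming a hypothetical wrapper function `unique_authors`: a function containing `yield from` is an ordinary synchronous generator function, and Python rejects `yield from` inside `async def`.

```python
import inspect

def unique_authors(books):
    """Hypothetical wrapper around the line from the question."""
    yield from {book.get("author") for book in books if book.get("author")}

is_sync_generator = inspect.isgeneratorfunction(unique_authors)
is_async_generator = inspect.isasyncgenfunction(unique_authors)
print(is_sync_generator, is_async_generator)  # True False

# `yield from` inside `async def` does not compile, so the line cannot be an async generator
try:
    compile("async def f():\n    yield from [1]\n", "<sketch>", "exec")
    async_yield_from_compiles = True
except SyntaxError:
    async_yield_from_compiles = False
print(async_yield_from_compiles)  # False
```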