# LLM Magicoder

## Paper: [Magicoder: Source Code Is All You Need](https://arxiv.org/abs/2312.02120)

## [Github](https://github.com/ise-uiuc/magicoder)
![Magicoder overview](https://raw.githubusercontent.com/ise-uiuc/magicoder/830ef3bae6c964d913937e4146817d5ecf1ab106/assets/overview.svg)

## Dataset
### [Magicoder-OSS-Instruct-75K](https://huggingface.co/datasets/ise-uiuc/Magicoder_oss_instruct_75k)

### [Magicoder-Evol-Instruct-110K](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K)

In [1]:
from transformers import pipeline
import torch



In [7]:
MAGICODER_PROMPT = """You are an exceptionally intelligent coding assistant that consistently delivers accurate and reliable responses to user instructions. Also you only provide python code for the solution while keeping the import statements for the package that you use.

@@ Instruction
{instruction}


@@ Response
"""

instruction = "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception."


In [4]:
generator = pipeline(
    model="ise-uiuc/Magicoder-S-DS-6.7B",
    task="text-generation",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
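
Loading with `torch_dtype=torch.bfloat16` stores each weight in 2 bytes, which roughly halves the memory needed compared with 32-bit floats. The sketch below is a back-of-the-envelope estimate only: it assumes about 6.7 billion parameters (inferred from the model name, not measured) and counts weights alone, ignoring activations and the generation cache. The helper name `estimate_weight_memory_gb` is hypothetical.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}

# Assumption: parameter count taken from the "6.7B" in the model name.
ASSUMED_NUM_PARAMS = 6.7e9


def estimate_weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Estimate weight storage in gigabytes (10**9 bytes), weights only."""
    return num_params * bytes_per_param / 1e9


for dtype, nbytes in BYTES_PER_PARAM.items():
    size_gb = estimate_weight_memory_gb(ASSUMED_NUM_PARAMS, nbytes)
    print(f"{dtype}: ~{size_gb:.1f} GB for weights")
```

Under these assumptions the bfloat16 weights need on the order of 13 GB, so `device_map="auto"` may place some layers on CPU when the GPU has less memory than that.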


In [51]:
MAGICODER_PROMPT = """You are an exceptionally intelligent coding assistant that consistently delivers accurate and reliable responses to user instructions. Also you only provide python code for the solution while keeping the import statements for the package that you use.

@@ Instruction
{instruction}


@@ Response
"""

# instruction = "Write a fibonacci series upto n."
instruction = "Write python code for Whenever an SQL query returns a blank result, send an email via Gmail based on certain parameters. Whenever the user replies to the email, trigger a webhook to parse and insert Gmail response data into the SQL table again."
prompt = MAGICODER_PROMPT.format(instruction=instruction)
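
By default, a text-generation pipeline returns the prompt followed by the completion, so the system message and instruction show up again in the output. A minimal sketch of recovering only the completion is shown here; the helper `strip_echoed_prompt` and the example strings are hypothetical stand-ins, not part of Magicoder or `transformers`.

```python
def strip_echoed_prompt(generated_text: str, prompt: str) -> str:
    """Return only the completion when the output starts with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    # Fall back to the full text if the prompt was not echoed verbatim.
    return generated_text


# Hypothetical values standing in for a real prompt and pipeline output.
example_prompt = "@@ Instruction\nSay hi.\n\n@@ Response\n"
example_output = example_prompt + "print('hi')"

print(strip_echoed_prompt(example_output, example_prompt))
```

The `transformers` text-generation pipeline also accepts `return_full_text=False`, which should give the same result without post-processing.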

In [52]:
result = generator(prompt, max_length=2048, num_return_sequences=1, temperature=1.0)
print(result[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:32014 for open-end generation.


You are an exceptionally intelligent coding assistant that consistently delivers accurate and reliable responses to user instructions. Also you only provide python code for the solution while keeping the import statements for the package that you use.

@@ Instruction
Write python code for Whenever an SQL query returns a blank result, send an email via Gmail based on certain parameters. Whenever the user replies to the email, trigger a webhook to parse and insert Gmail response data into the SQL table again.


@@ Response
This is a complex task that requires knowledge of several different areas of programming, including SQL, Gmail API, and webhooks. Here's a simplified example of how you might approach this task using Python and the `sqlite3`, `smtplib`, and `imaplib` libraries for SQL and email, and the `requests` library for webhooks.

Please note that this is a simplified example and does not include error handling, security measures, or other best practices.

```python
import sqlite
```

In [54]:
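import re

# The pipeline echoes the prompt, so search the whole generated text
# for the first complete fenced Python block in the model's response.
generated = result[0]["generated_text"]
match = re.search(r"```python(.*?)```", generated, flags=re.DOTALL)
if match is None:
    raise ValueError("No complete fenced Python block found in the generated text")
code = match.group(1)
print(code)
# Review the generated code before running it; exec() returns None and
# executing unreviewed model output is unsafe.
# exec(code)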


import sqlite3
import smtplib
import imaplib
import requests
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

# Connect to SQLite database
conn = sqlite3.connect('my_database.db')
c = conn.cursor()

# SQL query
c.execute("SELECT * FROM my_table")
result = c.fetchall()

# If the query returns a blank result, send an email
if not result:
    msg = MIMEMultipart()
    msg['From'] ='sender@gmail.com'
    msg['To'] ='receiver@gmail.com'
    msg['Subject'] = 'Your Subject'

    body = 'Your message'
    msg.attach(MIMEText(body, 'plain'))

    server = smtplib.SMTP('smtp.gmail.com', 587)
    server.starttls()
    server.login(msg['From'], 'password')
    server.sendmail(msg['From'], msg['To'], msg.as_string())
    server.quit()

# If the user replies to the email, trigger a webhook
imaplib._imaplib.IMAP4.noop()
mail = imaplib.IMAP4_SSL("imap.gmail.com")
mail.login('sender@gmail.com', 'password')
mail.select("inbox")

result, data = mail.search(None, "(UNS