<img src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true" alt="alt text" width="700px">

# **DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model**

<img src="https://dp-cdn-deepseek.obs.cn-east-3.myhuaweicloud.com/api-docs/version_history.png" alt="alt text" width="800px">

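DeepSeek's four best features:

1. Up to 236B parameters, leading on Chinese benchmarks, and the best openly released model on Hugging Face.
2. Cheap, cheap, cheap! 1 yuan per million input tokens and 2 yuan per million output tokens.
3. Simple enough. There is only one flagship model, DeepSeek-V2, with no other variants, and the official API is easy to understand.
4. [The DeepSeek API introduces disk-based caching, cutting the price by another order of magnitude.](https://api-docs.deepseek.com/zh-cn/news/news0802) 0.1 yuan per million tokens on cache hits.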

<img src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/model_price.png?raw=true" alt="alt text" width="400px" >
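To make the pricing concrete, here is a rough cost estimate for a single request, using the per-million-token prices from the list above. The function name `estimate_cost` is hypothetical, and the sketch assumes the 0.1 yuan rate applies only to input tokens served from the cache.

```python
# Prices in yuan per million tokens, taken from the feature list above.
INPUT_PRICE = 1.0
OUTPUT_PRICE = 2.0
CACHED_INPUT_PRICE = 0.1  # assumption: applies only to cached input tokens


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Return the estimated cost in yuan for one request."""
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_PRICE
        + cached_tokens * CACHED_INPUT_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / 1_000_000


# One million input tokens plus one million output tokens, nothing cached.
print(estimate_cost(1_000_000, 1_000_000))
# Same output volume split differently, with half of the input served from cache.
print(estimate_cost(2_000_000, 500_000, cached_tokens=1_000_000))
```

Actual billing depends on the provider's current price list, so treat the constants as placeholders to update.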



## Single-turn chat

In [None]:
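# Install the OpenAI SDK first: `pip3 install openai`
import os

from openai import OpenAI

# Read the key from the environment instead of hardcoding it in the notebook.
# DEEPSEEK_API_KEY is an assumed variable name; export it before running.
client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"], base_url="https://api.deepseek.com")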

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Please introduce yourself"},
    ],
    max_tokens=1024,
    temperature=0.7,
    stream=False
)

print(response.choices[0].message.content)

In [None]:
# Check the account balance
import os

import requests

url = "https://api.deepseek.com/user/balance"

headers = {
    "Accept": "application/json",
    # DEEPSEEK_API_KEY is an assumed environment variable name
    "Authorization": f"Bearer {os.environ['DEEPSEEK_API_KEY']}",
}

response = requests.get(url, headers=headers, timeout=30)
response.raise_for_status()  # fail loudly on a bad key or an HTTP error

print(response.text)

## Multi-turn chat

In [None]:
import os

from openai import OpenAI

# DEEPSEEK_API_KEY is an assumed environment variable name
client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"], base_url="https://api.deepseek.com")

# Round 1
messages = [{"role": "user", "content": "What's the highest mountain in the world?"}]

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages
)

print(response.choices[0].message)
print(f"Messages Round 1: {messages}")


In [None]:
# Round 2

messages.append(response.choices[0].message)  # append the model's round 1 output to the end of messages
messages.append({"role": "user", "content": "What is the second?"})  # append the new question to the end of messages
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages
)

print(response.choices[0].message)
print(f"Messages Round 2: {messages}")
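The two rounds above work because every request carries the full conversation so far. The pattern can be sketched without any network call. The helper `add_turn` is a hypothetical name, and the assistant reply is a stand-in for what `response.choices[0].message.content` would return.

```python
def add_turn(messages, assistant_reply, next_question):
    """Return a new history with the assistant reply and the next user question appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_question},
    ]


history = [{"role": "user", "content": "What's the highest mountain in the world?"}]

# Stand-in reply; in the notebook this text comes from the API response.
history = add_turn(history, "Mount Everest.", "What is the second?")

for message in history:
    print(f"{message['role']}: {message['content']}")
```

Returning a new list rather than mutating the old one keeps earlier snapshots of the history intact, which can help when comparing rounds.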

## Text-to-SQL: tabular data analysis

Write the corresponding SQL or pandas code from a text description.

In [None]:
!pip3 install ucimlrepo

In [None]:
from ucimlrepo import fetch_ucirepo

# fetch dataset
adult = fetch_ucirepo(id=2)

# data (as pandas dataframes)
X = adult.data.features
y = adult.data.targets

# metadata
print(adult.metadata)

# variable information
print(adult.variables)


In [None]:
print(adult.metadata['additional_info']['variable_info'])

In [None]:
col_info = adult.metadata['additional_info']['variable_info']

In [None]:
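dataset_desc = "Below is a description of the dataset. For a continuous column, the column name and type are shown; otherwise the column name and its category values are shown: " + col_info + "."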
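When a dataset ships without such a metadata string, a description of the same shape could be derived from the data itself. The following is a minimal sketch using only the standard library. The function `describe_columns` and the toy `sample` data are hypothetical, and treating every all-numeric column as continuous is a simplifying assumption.

```python
from numbers import Number


def describe_columns(columns):
    """Build one description line per column from a {name: values} mapping."""
    lines = []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        is_numeric = bool(present) and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in present
        )
        if is_numeric:
            # Numeric column: show the name and the type only.
            lines.append(f"{name}: continuous")
        else:
            # Categorical column: show the name and its distinct values.
            categories = sorted({str(v) for v in present})
            lines.append(f"{name}: {', '.join(categories)}")
    return "\n".join(lines)


# Hypothetical toy data standing in for the real dataset.
sample = {
    "age": [39, 50, 38],
    "sex": ["Male", "Female", "Male"],
}
print(describe_columns(sample))
```

With a pandas DataFrame, `df.to_dict("list")` yields a mapping of this shape, although missing values would then need extra handling.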

In [None]:
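prompt_question = "Assume the input dataset is df. Write the pandas code for the following request: how many men and women are there, and what is the average age of each group?"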

In [None]:
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are an expert programmer. Write the code that my request asks for."},
        {"role": "user", "content": dataset_desc + prompt_question},
    ],
    max_tokens=1024,
    temperature=0.7,
    stream=False
)

print(response.choices[0].message.content)

In [None]:
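prompt_question = "Assume the input dataset is df. Write the pandas code for the following request: how many men and women are there, and what is the average age of each group? Note: do not give any explanation; output only the correct code."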

In [None]:
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are an expert programmer. Write the code that my request asks for."},
        {"role": "user", "content": dataset_desc},
        {"role": "user", "content": prompt_question},
    ],
    max_tokens=1024,
    temperature=0.7,
    stream=False
)

print(response.choices[0].message.content)
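Even when asked for code only, a chat model may wrap its answer in a Markdown code fence. Assuming that reply format, the code can be pulled out before it is reviewed or run. The helper `extract_code` and the sample reply are hypothetical, not part of any SDK.

```python
import re

FENCE = "`" * 3  # three backticks, built this way so this example can sit inside a fence

# Match an opening fence with an optional language tag, then capture up to the closing fence.
CODE_BLOCK = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(reply):
    """Return the first fenced code block in reply, or the whole reply if none is found."""
    match = CODE_BLOCK.search(reply)
    return match.group(1).strip() if match else reply.strip()


# Hypothetical model reply, standing in for response.choices[0].message.content.
sample_reply = (
    "Here is the code:\n"
    + FENCE + "python\n"
    + "print(df.groupby('sex')['age'].agg(['count', 'mean']))\n"
    + FENCE
)
print(extract_code(sample_reply))
```

Generated code should still be read before execution; extraction only removes the wrapper.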

In [None]:
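prompt_question = "Assume the input dataset is df. Write the pandas code for the following request: draw a bar chart of the average age for each education level, and show the age value on each bar. Note: do not give any explanation; output only the correct code."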

In [None]:
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are an expert programmer. Write the code that my request asks for."},
        {"role": "user", "content": dataset_desc},
        {"role": "user", "content": prompt_question},
    ],
    max_tokens=1024,
    temperature=0.7,
    stream=False
)

print(response.choices[0].message.content)

In [None]:
!pip3 install matplotlib

In [None]:
df = X.copy()

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Compute the average age for each education level
avg_age_by_education = df.groupby('education')['age'].mean().sort_values(ascending=False)

# Draw the bar chart
avg_age_by_education.plot(kind='bar', figsize=(10, 6))

# Show the value on top of each bar
for i, v in enumerate(avg_age_by_education):
    plt.text(i, v, f"{v:.2f}", ha='center', va='bottom')

plt.title('Average Age by Education')
plt.xlabel('Education')
plt.ylabel('Average Age')
plt.show()

In [None]:
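prompt_question = "Assume the input dataset is df. Write the pandas code for the following request: use seaborn to draw a bar chart of the average age for each education level, and show the age value on each bar. Note: do not give any explanation; output only the correct code."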

In [None]:
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are an expert programmer. Write the code that my request asks for."},
        {"role": "user", "content": dataset_desc},
        {"role": "user", "content": prompt_question},
    ],
    max_tokens=1024,
    temperature=0.7,
    stream=False
)

print(response.choices[0].message.content)

In [None]:
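# Build a machine learning model
prompt_question = """
Assume the input dataset is df. Write the pandas code for the following request:
Use xgboost to build a binary classification model. Assume the target label is the Income column and the columns of df are the input data.
Please make sure to:
1. Handle continuous and discrete features correctly
2. Handle missing values correctly
3. Split the dataset into train/dev/test sets
4. Evaluate with AUC, accuracy, and F1-score
"""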


In [None]:
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are an expert programmer. Write the code that my request asks for."},
        {"role": "user", "content": dataset_desc},
        {"role": "user", "content": prompt_question},
    ],
    max_tokens=2056,
    temperature=0.7,
    stream=False
)

print(response.choices[0].message.content)
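As a reference point when checking the model's answer, requirement 3 of the prompt (the train/dev/test split) can be sketched without any ML library. The function `split_indices`, the 80/10/10 ratio, and the fixed seed are all assumptions chosen for illustration; the model may well use a library routine instead.

```python
import random


def split_indices(n_rows, dev_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle row indices and cut them into disjoint train/dev/test lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = int(n_rows * test_frac)
    n_dev = int(n_rows * dev_frac)
    test = indices[:n_test]
    dev = indices[n_test:n_test + n_dev]
    train = indices[n_test + n_dev:]
    return train, dev, test


train_idx, dev_idx, test_idx = split_indices(1000)
print(len(train_idx), len(dev_idx), len(test_idx))
```

With a pandas DataFrame, the resulting lists can be passed to `df.iloc[...]` to select the rows of each subset.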