[Feature] Support chatGPT source connector #4047

Closed
1 of 2 tasks
xwm1992 opened this issue May 29, 2023 · 10 comments · Fixed by #4817
Labels
feature help wanted Extra attention is needed

Comments

@xwm1992
Contributor

xwm1992 commented May 29, 2023

Search before asking

  • I had searched in the issues and found no similar issues.

Feature Request

We can use the OpenAI SDK to support a ChatGPT source connector.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@pandaapo
Member

I am confused. How can ChatGPT provide the function of Source? I always feel that Source is at least one form of storage.

@xwm1992
Contributor Author

xwm1992 commented May 31, 2023

> I am confused. How can ChatGPT provide the function of Source? I always feel that Source is at least one form of storage.

The chatgpt-source-connector uses the OpenAI SDK. It acts like an HTTP server: it receives a request, forwards it to the ChatGPT service through the OpenAI SDK, and converts the response into a CloudEvent.
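The flow described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions, written in Python for brevity: `handle_request` and `ask_chat` are hypothetical names, `ask_chat` stands in for whatever OpenAI SDK call the connector would make, and the `source` and `type` values are placeholders rather than values taken from the project.

```python
import uuid
from datetime import datetime, timezone
from typing import Callable, Dict


def handle_request(prompt: str, ask_chat: Callable[[str], str]) -> Dict[str, str]:
    """Forward a prompt to the chat service and wrap the reply in a CloudEvent.

    ask_chat is a hypothetical stand-in for the OpenAI SDK call.
    """
    reply = ask_chat(prompt)  # the request is passed through unchanged
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),                # generated by the connector
        "source": "/chatgpt-source-connector",  # placeholder value
        "type": "chatgpt.response",             # placeholder value
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "text/plain",
        "data": reply,                          # the complete reply, untouched
    }
```

Passing the chat call in as a function keeps the sketch runnable without network access; a real connector would plug its SDK client in at that point.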

@Pil0tXia Pil0tXia added the help wanted Extra attention is needed label Jan 10, 2024
@jevinjiang
Contributor

I would like to try this.
The user inputs text, and I use GPT to generate JSON in a fixed format and return it.
This is the prompt I designed for this issue:

For the following text, extract the following information:
subject: What is the subject matter of the text? If this information is not found, output none.
datacontenttype: What type of data or information is contained in the text? If this information is not found, output none.
data: Extract all relevant data or information from the text. If this information is not found, output none.

Format the output as JSON with the following keys:
subject
datacontenttype
data

text:  the subject is eventmeshSubject , the datacontenttype is json , data is eventmesh

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":
```json
{
	"subject": string,
	"datacontenttype": string,
	"data": string
}
```

@Pil0tXia
Member

It's not a good idea to let GPT3.5 generate any fixed data structure.

@jevinjiang
Contributor

@Pil0tXia OK, thank you. I understand what you mean. I will keep improving the prompt so that it converts any user input into a result that follows the CloudEvents specification, instead of limiting it to a fixed structure.

@jevinjiang
Contributor

@Pil0tXia Hi, this is my redesigned prompt. The data structure is defined inside data, and the format of data changes with datacontenttype, which can be specified as XML, JSON, or text. Does this design meet the requirements?

You are an AI assistant named CloudEventsConverter. Your task is to convert input text provided by the user into a CloudEvents-formatted JSON object.

For the following text, extract the following information:

Create a CloudEvents-formatted JSON object with the following fields:
   - specversion: Set to "1.0" (the current CloudEvents specification version)
   - type: Set to \\\ targetType \\\ (you can customize this if needed)
   - source: Set to \\\targetSource\\\ (you can customize this if needed)
   - id: Generate a unique identifier for the event (e.g., "A234-1234-1234")
   - time: Set to the current timestamp in ISO 8601 format (e.g., "2023-03-25T12:34:56.789Z")
   - datacontenttype: Set to \\\ application/json  \\\
   - data: Set to the input text provided by the user
	 	\\\
		-orderNo
		-address
		-phone
		\\\

text: \\\ A user at the Tianjin store with phone number 13356288979 placed an order; the order number is 11221122 \\\

Return the CloudEvents-formatted JSON object to the user. The format of the data field matches the datacontenttype.

Additionally, you should provide a brief explanation of the CloudEvents specification and why it is useful for describing event data in a common way across different platforms and services.

Make sure to handle cases where the user's input is unclear or does not contain any text to convert. In such cases, politely ask the user to provide input text.

Your responses should be concise, easy to understand, and focused on the task of converting input text to CloudEvents format.

result:

{
  "specversion": "1.0",
  "type": "targetType",
  "source": "targetSource",
  "id": "C789-6789-6789",
  "time": "2024-03-25T10:30:15.000Z",
  "datacontenttype": "application/json",
  "data": {
    "orderNo": "11221122",
    "address": "Tianjin store",
    "phone": "13356288979"
  }
}
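If a model is asked to produce the whole event like this, the receiving code would probably want to check the reply before trusting it. The sketch below is one hedged way to do that in Python: `parse_model_event` is a hypothetical helper, it assumes the reply may or may not be wrapped in a Markdown json fence, and it checks only the four attributes that CloudEvents 1.0 marks as required.

```python
import json
import re
from typing import Any, Dict

REQUIRED = ("specversion", "id", "source", "type")  # required by CloudEvents 1.0
TICKS = "`" * 3  # Markdown code-fence marker
FENCE = re.compile(
    r"^" + TICKS + r"(?:json)?\s*\n(.*?)\n" + TICKS + r"\s*$", re.DOTALL
)


def parse_model_event(reply: str) -> Dict[str, Any]:
    """Parse a model reply into a CloudEvent dict, or raise ValueError."""
    text = reply.strip()
    match = FENCE.match(text)
    if match:  # unwrap the Markdown fence if the model added one
        text = match.group(1)
    event = json.loads(text)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(event, dict):
        raise ValueError("reply is not a JSON object")
    missing = [name for name in REQUIRED if name not in event]
    if missing:
        raise ValueError(f"missing CloudEvents attributes: {missing}")
    if event["specversion"] != "1.0":
        raise ValueError("unsupported specversion")
    return event
```

A reply with a trailing comma or a dropped attribute fails here instead of propagating downstream, which illustrates why relying on the model for the envelope is fragile.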

@Pil0tXia
Member

@jevinjiang

Actually, you don't need to ask GPT to return data in JSON format. What data GPT generates is determined by the user. The GPT source connector only needs to receive the complete response from GPT and wrap it in a CloudEvent.

@jevinjiang
Contributor

So the connector should just convert the input into data, with the format passed in by the user, and only part of my prompt's functionality is needed, right? Actually, the current data already follows the user-defined datacontenttype, and the attributes inside data are user-defined as well: everything between the \\\ delimiters is entered by the user. I should just remove the CloudEvent wrapper from the prompt, accept GPT's reply in the code, and then set it as the data.
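Setting the data in code according to the user-supplied datacontenttype might look like the sketch below. It is an assumption-laden illustration in Python, not the connector's code: `decode_data` is a hypothetical helper, and falling back to the raw text when JSON parsing fails is one possible policy among several.

```python
import json
from typing import Any


def decode_data(reply: str, datacontenttype: str) -> Any:
    """Turn the raw GPT reply into the event's data, guided by datacontenttype."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = datacontenttype.split(";")[0].strip().lower()
    if media_type == "application/json" or media_type.endswith("+json"):
        try:
            return json.loads(reply)
        except json.JSONDecodeError:
            return reply  # keep the reply as text rather than dropping it
    return reply  # XML and plain text are carried as strings
```

For example, `decode_data('{"year": 2023}', "application/json")` yields a dict, while an XML reply is passed through unchanged as a string.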

@jevinjiang
Contributor

@Pil0tXia
After these changes, could you check whether the prompt meets the requirements?

You are an AI assistant named DataConverter. Your task is to convert input text provided by the user into a string.

For the following text, extract the following information:

datacontenttype is \\\ application/json \\\

Create a string with the following fields:

\\\
-year
-up
-down
\\\

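text: \\\ According to LatePost Finance, due to a decline in net interest income, China Merchants Bank's total operating revenue in 2023 fell 1.64% year on year to 339.123 billion yuan. Behind this, the bank's corporate and personal demand deposits both decreased to varying degrees last year, while time deposits surged 20.89% and 48.58% year on year respectively; the cuts in deposit rates were not enough to offset the negative impact on the bank of the sharp increase in volume. As for the structure of customer deposits, the long-standing "80/20 rule" continues. As of the end of 2023, China Merchants Bank had taken in 13.32 trillion yuan in personal deposits. Of this, customers at the Golden Sunflower level and above (individuals whose monthly average daily total assets at China Merchants Bank exceed 500,000 yuan) accounted for 10.82 trillion yuan, or 81.28%. This is also the first time the total assets of customers at this level have exceeded 10 trillion yuan, yet they make up only 2.35% of the bank's personal accounts. Over the same period, the number of private banking customers whose monthly average daily total assets at the bank exceed 10 million yuan (all currencies converted to RMB) surpassed 148,800, up 82% from the end of 2019. \\\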

Return the string to the user. The format of the data field matches the datacontenttype.

Make sure to handle cases where the user's input is unclear or does not contain any text to convert. In such cases, politely ask the user to provide input text.

result:

{
"year": 2023,
"up": 20.89,
"down": 1.64
}

@Pil0tXia
Member

@jevinjiang I don't think you need to define a prompt. The prompt is what the user sends to GPT, and the role of the source connector is to receive GPT's response.

jevinjiang added a commit to jevinjiang/eventmesh that referenced this issue Apr 7, 2024
jevinjiang added a commit to jevinjiang/eventmesh that referenced this issue Apr 8, 2024
jevinjiang pushed a commit to jevinjiang/eventmesh that referenced this issue Apr 8, 2024
jevinjiang added a commit to jevinjiang/eventmesh that referenced this issue Apr 10, 2024
jevinjiang added a commit to jevinjiang/eventmesh that referenced this issue Apr 10, 2024
jevinjiang added a commit to jevinjiang/eventmesh that referenced this issue Apr 10, 2024
jevinjiang added a commit to jevinjiang/eventmesh that referenced this issue Apr 10, 2024
jevinjiang pushed a commit to jevinjiang/eventmesh that referenced this issue Apr 10, 2024
jevinjiang pushed a commit to jevinjiang/eventmesh that referenced this issue Apr 10, 2024
jevinjiang pushed a commit to jevinjiang/eventmesh that referenced this issue Apr 15, 2024
jevinjiang pushed a commit to jevinjiang/eventmesh that referenced this issue Apr 16, 2024
jevinjiang pushed a commit to jevinjiang/eventmesh that referenced this issue Apr 17, 2024
jevinjiang pushed a commit to jevinjiang/eventmesh that referenced this issue Apr 17, 2024
jevinjiang pushed a commit to jevinjiang/eventmesh that referenced this issue Apr 19, 2024
jevinjiang pushed a commit to jevinjiang/eventmesh that referenced this issue Apr 20, 2024
jevinjiang pushed a commit to jevinjiang/eventmesh that referenced this issue Apr 20, 2024
pandaapo pushed a commit that referenced this issue Apr 21, 2024
* [ISSUE #4047] Support chatGPT source connector

* [ISSUE #4047] Add OpenAI configuration and adjust DTO

* [ISSUE #4047] Join parse request support

* [ISSUE #4047] impl Parse request

* [ISSUE #4047] fix code style

* [ISSUE #4047] fix code style

* [ISSUE #4047] fix dependencies check failed

* [ISSUE #4047] fix dependencies check

* [ISSUE #4047] fix license check

* [ISSUE #4047] fix review question

* [ISSUE #4047] fix review question

* [ISSUE #4047] add default value

* [ISSUE #4047] fix test

* [ISSUE #4047] default timeout value is zero , not timeout .

* [ISSUE #4047] fix review

* [ISSUE #4047] fix license check

---------

Co-authored-by: JiangShuJu <shuju.jiang@baozun.com>