How much does it usually take to complete? #5

Closed
Ziyu0118 opened this issue Sep 11, 2023 · 7 comments
Comments

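@Ziyu0118

First of all, thank you very much for your work! Following the README, I added my OpenAI API key ("sk-xxxxxx"), moved the four datasets (Book, News, Music, Movie) into the data folder, and ran "python script/run.py" from the LLM4RS directory. The program printed its arguments, and more than two hours later it still has not completed. Is this normal? How long does one run usually take on a machine with a GPU? Thanks for your help.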

@Ziyu0118
Author

I checked my log and found that the requests keep failing:
2023-09-11 14:23:24,354 Request 5-0 failed with Exception None
2023-09-11 14:23:24,355 Starting request #5-0
2023-09-11 14:23:24,355 Request 5-0 failed with Exception None
2023-09-11 14:23:24,357 Starting request #5-0
2023-09-11 14:23:24,358 Request 5-0 failed with Exception None

I don't know what is causing this.

@Ziyu0118
Author

Same question as huazhen02. It looks like the connection to the host is not succeeding. I will try a run with the proxy provided by the author.

@Ziyu0118
Author

The proxy you use does not seem to work for me. I tested the network with your curl command and got the following:
(env_name) [bianz@login001 LLM4RS]$ curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxx" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
{
  "id": "chatcmpl-7xna0bYORjlclyvn0tzi7JWQC3Dha",
  "object": "chat.completion",
  "created": 1694486292,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 9,
    "total_tokens": 18
  }
}
What does this result mean, and what should I do about it?

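@rainym00d
Owner

  1. How long a run usually takes

This depends on max_requests_per_minute, max_tokens_per_minute, and your network conditions. As long as you stay under OpenAI's request limits, the larger these two parameters are, the faster the run. For the example in script/run.py, half an hour is more than enough on a good network (in my own network environment it finishes in under 10 minutes).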
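
As a rough illustration of the point above, the two rate limits alone imply a lower bound on the run time. The sketch below is not part of the repository: the function name and the workload numbers are assumptions chosen for illustration, and a real run takes longer because of network latency and retries.

```python
def min_runtime_minutes(num_requests, total_tokens,
                        max_requests_per_minute, max_tokens_per_minute):
    """Lower bound on run time (in minutes) implied by the two rate limits."""
    if max_requests_per_minute <= 0 or max_tokens_per_minute <= 0:
        raise ValueError("rate limits must be positive")
    # Whichever limit is tighter for this workload determines the bound.
    by_requests = num_requests / max_requests_per_minute
    by_tokens = total_tokens / max_tokens_per_minute
    return max(by_requests, by_tokens)


# Hypothetical workload: 3,000 requests of about 250 tokens each.
estimate = min_runtime_minutes(
    num_requests=3000,
    total_tokens=3000 * 250,
    max_requests_per_minute=1000,
    max_tokens_per_minute=50000,
)
print(f"at least {estimate:.1f} minutes")
```

In this made-up case the token limit, not the request limit, is the bottleneck, so raising only max_requests_per_minute would not shorten the run.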

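  2. How long a run takes on a machine with a GPU

The code calls the OpenAI API and does not run a model locally, so a GPU makes no difference.

  3. What the curl test result means

The response came back correctly, so in principle you can reach the OpenAI API.

  4. Network requests fail when running the code

As question 3 shows, your environment can reach the OpenAI API.
If you connect directly (no VPN), try setting the proxy parameter in script/run.py to None.
If you use a VPN, try setting proxy to the http(s) port address of your VPN. See #1.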
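
The two proxy cases above can be sketched as a small helper that turns a user-supplied setting into either None (direct connection) or a full proxy URL. This is only an illustration: normalize_proxy is a hypothetical name, the address 127.0.0.1:7890 is a made-up local port, and how script/run.py actually forwards the value is not shown in this thread.

```python
from urllib.parse import urlsplit


def normalize_proxy(proxy):
    """Return None for a direct connection, else a proxy URL with a scheme."""
    if proxy is None or not str(proxy).strip():
        return None  # direct connection, no proxy
    proxy = str(proxy).strip()
    if "://" not in proxy:
        # Assumption: a bare host:port refers to an HTTP proxy.
        proxy = "http://" + proxy
    parts = urlsplit(proxy)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a usable http(s) proxy URL: {proxy!r}")
    return proxy


print(normalize_proxy(None))              # direct connection
print(normalize_proxy("127.0.0.1:7890"))  # hypothetical local VPN port
```

Validating the value up front makes a malformed proxy string fail immediately with a readable message instead of surfacing later inside the request loop.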

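@rainym00d reopened this Sep 12, 2023
@Ziyu0118
Author

Thank you for your reply! I set proxy to None as you suggested and re-ran, but it still does not work. I then followed issue 2 and removed the proxy variable from the post request, which did not help either. After that I added a few print calls to openai.py and found that queue_of_requests_to_retry.empty() [line 65] is always False during the run, and that the code below loops continuously, printing "making next request" and "https://api.openai.com/v1/completions" over and over: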

        # if enough capacity available, call API
        if next_request:
            #print("making next request...")
            next_request_tokens = next_request.token_consumption
            if (
                available_request_capacity >= 1
                and available_token_capacity >= next_request_tokens
            ):
                # update counters
                available_request_capacity -= 1
                available_token_capacity -= next_request_tokens
                next_request.attempts_left -= 1

                # call API
                print(request_url)
                asyncio.create_task(
                    next_request.call_API(
                        request_url=request_url,
                        request_header=request_header,
                        retry_queue=queue_of_requests_to_retry,
                        save_filepath=save_filepath,
                        status_tracker=status_tracker,
                        proxy=proxy
                    )
                )
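
The capacity check in the snippet above is a form of token-bucket rate limiting: a request is sent only when both a request slot and enough tokens are available. The refill step is not shown in the snippet, so the sketch below is an assumption about the general technique rather than the repository's code, and both function names are hypothetical.

```python
def refill(available, max_per_minute, seconds_elapsed):
    """Add capacity in proportion to elapsed time, capped at the per-minute limit."""
    return min(available + max_per_minute * seconds_elapsed / 60.0, max_per_minute)


def can_send(available_requests, available_tokens, request_tokens):
    """Mirror of the check in the snippet: one request slot plus enough tokens."""
    return available_requests >= 1 and available_tokens >= request_tokens


# With a limit of 60 requests per minute, one second restores one request slot.
requests_left = refill(available=0, max_per_minute=60, seconds_elapsed=1)
tokens_left = refill(available=0, max_per_minute=6000, seconds_elapsed=1)
print(can_send(requests_left, tokens_left, request_tokens=50))
```

Under this scheme a request that fails immediately frees the loop to pick it up again as soon as capacity allows, which would be consistent with the rapid "Starting request" and "failed" pairs in the log.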

Later I checked the log and found two problems: one is that too many requests fail (Request 43-0), and the other is an InvalidURL error. The log shows the following:

2023-09-13 13:31:16,636 Request {'model': 'text-davinci-003', 'prompt': "You are a movie recommender system now.\nInput: Here is the watching history of a user: Gattaca, Armageddon, Big, Babes in Toyland, Gladiator. Based on this history, please rank the following candidate movies: (A) Con Air (B) Mulan (C) Nikita (D) Donnie Brasco (E) Star Wars: Episode I - The Phantom Menace\nOutput: The answer index is D C B A E.\nInput: Here is the watching history of a user: Sling Blade, Animal House, The Wizard of Oz, Blood Simple, My Life as a Dog. Based on this history, please rank the following candidate movies: (A) Heavenly Creatures (B) Gandhi (C) Diner (D) Antonia's Line (E) A Room with a View\nOutput: The answer index is", 'max_tokens': 20, 'temperature': 0, 'top_p': 1, 'frequency_penalty': 0.0, 'presence_penalty': 0.0, 'stop': '\n', 'logprobs': None, 'logit_bias': {}} failed after all attempts. Saving errors: [, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ]

2023-09-13 13:31:16,656 Task exception was never retrieved
future: <Task finished name='Task-3801' coro=<APIRequest.call_API() done, defined at /home/ziyubian/LLM4RS/src/api/openai.py:175> exception=TypeError('Object of type InvalidURL is not JSON serializable')>
Traceback (most recent call last):
  File "/home/ziyubian/LLM4RS/src/api/openai.py", line 214, in call_API
    append_to_jsonl({"task_id": self.task_id, "target": self.target, "target_index": self.target_index, "pos": self.pos, "request": self.request_json, "result": self.result}, save_filepath)
  File "/home/ziyubian/LLM4RS/src/api/openai.py", line 234, in append_to_jsonl
    json_string = json.dumps(data)
  File "/home/ziyubian/anaconda3/envs/LLM4RS/lib/python3.9/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/home/ziyubian/anaconda3/envs/LLM4RS/lib/python3.9/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/home/ziyubian/anaconda3/envs/LLM4RS/lib/python3.9/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/home/ziyubian/anaconda3/envs/LLM4RS/lib/python3.9/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type InvalidURL is not JSON serializable
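
The TypeError at the end of this traceback is a secondary failure: json.dumps cannot encode the exception objects collected in the result list, so the real error never reaches the output file. A minimal sketch of one way to make the saved errors readable, assuming a helper shaped like the append_to_jsonl in the traceback (the repository's actual implementation is not shown here, and the ValueError below is only a stand-in for the InvalidURL objects):

```python
import json


def append_to_jsonl(data, filename):
    """Append one JSON object per line; values json cannot encode are saved as repr()."""
    # default= is called only for objects the encoder does not know how to handle.
    json_string = json.dumps(data, default=repr)
    with open(filename, "a", encoding="utf-8") as f:
        f.write(json_string + "\n")


# Stand-in for the exception objects that ended up in the result list.
errors = [ValueError("invalid URL")]
print(json.dumps({"task_id": 43, "result": errors}, default=repr))
```

This does not fix the underlying request failure. It only lets the error list be written out so the cause can be inspected.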

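I am running the program in an Anaconda virtual environment on Ubuntu. The network is fine and I can reach the OpenAI API normally. Are there any other fixes I could try? Thank you for your help.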

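@rainym00d
Owner

Sorry, I cannot tell what went wrong on your side. Most likely it is still a network configuration issue. I cloned the repository again and created a fresh environment with Anaconda. After configuring the proxy, it works normally.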

image

@rainym00d reopened this Sep 15, 2023
@Ziyu0118
Author

Thank you for the quick reply. I will go through my setup again carefully to find where the problem is. Thanks!
