Unable to run the API extension #34

Closed
avinashkr29 opened this issue Sep 6, 2023 · 1 comment
avinashkr29 commented Sep 6, 2023

After checking the "api" option under the Session tab, I clicked the "Apply flags/extension and Restart" button as shown below:
[Screenshot: the "api" option and the "Apply flags/extension and Restart" button in the Session tab]

This generated the following logs in the colab console:

2023-09-06 16:30:28 WARNING:skip module injection for FusedLlamaMLPForQuantizedModel not support integrate without triton yet.
2023-09-06 16:30:28 INFO:Loaded the model in 51.97 seconds.

2023-09-06 16:30:28 INFO:Loading the extension "gallery"...
Running on local URL:  http://127.0.0.1:7860/
Running on public URL: https://<my_old_live_link>/

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)

---------------------------------<Below is the log after I restarted the server with the api option>---------------------------

ERROR:    Exception in ASGI application

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/websockets/websockets_impl.py", line 247, in run_asgi
    result = await self.app(self.scope, self.asgi_receive, self.asgi_send)
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 149, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/cors.py", line 75, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 341, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 82, in app
    await func(session)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 289, in app
    await dependant.call(**values)
  File "/usr/local/lib/python3.10/dist-packages/gradio/routes.py", line 536, in join_queue
    session_info = await asyncio.wait_for(
  File "/usr/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
    return fut.result()
  File "/usr/local/lib/python3.10/dist-packages/starlette/websockets.py", line 133, in receive_json
    self._raise_on_disconnect(message)
  File "/usr/local/lib/python3.10/dist-packages/starlette/websockets.py", line 105, in _raise_on_disconnect
    raise WebSocketDisconnect(message["code"])
starlette.websockets.WebSocketDisconnect: 1012
Closing server running on port: 7860
2023-09-06 16:31:32 INFO:Loading the extension "gallery"...
2023-09-06 16:31:32 ERROR:Failed to load the extension "api".
Traceback (most recent call last):
  File "/content/text-generation-webui/modules/extensions.py", line 40, in load_extensions
    extension.setup()
  File "/content/text-generation-webui/extensions/api/script.py", line 10, in setup
    if shared.public_api:
AttributeError: module 'modules.shared' has no attribute 'public_api'
Starting API at http://127.0.0.1:5000/api
Running on local URL:  http://127.0.0.1:7860/
Running on public URL: https://<my_new_live_link>/

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
Output generated in 7.95 seconds (4.90 tokens/s, 39 tokens, context 45, seed 932200172)
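
As a side note on the `AttributeError` in the log above: `modules.shared` evidently only gains a `public_api` attribute when the corresponding command-line flag is parsed, so the extension's unconditional `shared.public_api` lookup fails when the flag is missing. A minimal sketch of that failure mode and a defensive lookup (the `Shared` class below is a stand-in for illustration, not the project's actual code):

```python
# Stand-in namespace mimicking modules.shared when the --public-api
# flag was never passed, so the attribute was never set on the module.
class Shared:
    pass

shared = Shared()

# Accessing shared.public_api directly would raise AttributeError,
# exactly as in the traceback; getattr with a default avoids that.
public_api = getattr(shared, 'public_api', False)
print(public_api)  # False
```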

I then tried the following code to get a response, but I am getting a 404 error.
Could you please tell me how to start the API correctly and get responses?


```python
import requests

# For local streaming, the websockets are hosted without ssl - http://
HOST = '<my_new_live_link>'
URI = f'https://{HOST}/api/v1/generate'

# For reverse-proxied streaming, the remote will likely host with ssl - https://
# URI = 'https://your-uri-here.trycloudflare.com/api/v1/generate'


def run(prompt):
    request = {
        'prompt': prompt,
        'max_new_tokens': 250,
        'auto_max_new_tokens': False,
        'max_tokens_second': 0,

        # Generation params. If 'preset' is set to different than 'None', the values
        # in presets/preset-name.yaml are used instead of the individual numbers.
        'preset': 'None',
        'do_sample': True,
        'temperature': 0.7,
        'top_p': 0.1,
        'typical_p': 1,
        'epsilon_cutoff': 0,  # In units of 1e-4
        'eta_cutoff': 0,  # In units of 1e-4
        'tfs': 1,
        'top_a': 0,
        'repetition_penalty': 1.18,
        'repetition_penalty_range': 0,
        'top_k': 40,
        'min_length': 0,
        'no_repeat_ngram_size': 0,
        'num_beams': 1,
        'penalty_alpha': 0,
        'length_penalty': 1,
        'early_stopping': False,
        'mirostat_mode': 0,
        'mirostat_tau': 5,
        'mirostat_eta': 0.1,
        'guidance_scale': 1,
        'negative_prompt': '',

        'seed': -1,
        'add_bos_token': True,
        'truncation_length': 2048,
        'ban_eos_token': False,
        'skip_special_tokens': True,
        'stopping_strings': []
    }
    print(URI)
    response = requests.post(URI, json=request)
    print(response)

    if response.status_code == 200:
        result = response.json()['results'][0]['text']
        print(prompt + result)


if __name__ == '__main__':
    prompt = "In order to make homemade bread, follow these steps:\n1)"
    run(prompt)
```

avinashkr29 (Author)

I was able to solve the issue by changing the final command in colab to the following:

```
!pip install flask-cloudflared
!python server.py --api --public-api --share --settings /content/settings.yaml --wbits 4 --groupsize 128 --loader AutoGPTQ --model /content/text-generation-webui/models/vicuna-13b-GPTQ-4bit-128g
```
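
For anyone hitting the same 404: with `--api --public-api`, the server prints a separate public API URL (a `*.trycloudflare.com` address) in the console, in addition to the gradio share link, and requests should go to that address rather than to the gradio link. A minimal sketch, assuming a placeholder API URL (the real one is assigned at startup):

```python
import requests

# Placeholder URL: substitute the trycloudflare address printed in the
# colab console after startup, not the gradio share link.
API_URL = 'https://your-api-url.trycloudflare.com/api/v1/generate'

def build_request(prompt, max_new_tokens=250):
    """Minimal payload for the /api/v1/generate endpoint; all other
    generation parameters fall back to the server-side defaults."""
    return {'prompt': prompt, 'max_new_tokens': max_new_tokens}

def generate(prompt):
    response = requests.post(API_URL, json=build_request(prompt))
    response.raise_for_status()  # surfaces a 404 immediately
    return response.json()['results'][0]['text']
```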
