# FastAPI request methods

## Method 1: Use the frontend


See [app.py](./app.py) for the code. To start the interactive frontend:

```bash
$ cd path/to/code_arxiv_summarizer
$ streamlit run app.py
```

## Method 2: Send requests directly


Start the backend service:
```bash
$ uvicorn backend:app --host 0.0.0.0 --port 8076 --reload
```

With the port set to 8076, the request URL (`http://host_ip:port`) is:
```python
req_url = "http://61.241.103.32:8076"
```
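Every endpoint below is addressed relative to this base URL. As a minimal sketch, the helper `build_url` (a hypothetical name, not part of the project) joins a host, a port, and an endpoint name, keeping the trailing slash used by the endpoint paths in this document:

```python
def build_url(host: str, port: int, endpoint: str = "") -> str:
    """Join host, port and an optional endpoint name into a request URL."""
    base = f"http://{host}:{port}"
    if not endpoint:
        return base
    # The endpoint paths used in this document end with a slash, e.g. /get_links/
    return f"{base}/{endpoint.strip('/')}/"


req_url = build_url("127.0.0.1", 8076)
link_url = build_url("127.0.0.1", 8076, "get_links")
print(req_url)
print(link_url)
```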

### get_links

The endpoint accepts the request body below. Any parameter you omit is filled in with its default, so even an empty body returns a result:

```python
from fastapi import FastAPI, Body
import json
from typing import List, Union, Literal, Dict
from pydantic import BaseModel, Field


class LinkRequest(BaseModel):
    key_word: Union[str, None] = None
    proxies: Union[dict, None] = None
    headers: Union[dict, None] = None
    max_num: int = 5
    line_length: int = 15
    searchtype: str = "all"
    abstracts: str = "show"
    order: str = "-announced_date_first"
    size: int = 50
    show_meta_data: bool = True
    daily_type: str = "cs"
    max_retry: int = 3
    wait_fixed: int = 1000
```
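To see how omitted parameters fall back to their defaults, here is a minimal sketch that uses a standard-library dataclass as a stand-in for `LinkRequest`. The names `LinkDefaults` and `effective_params` are hypothetical; the server itself uses pydantic, which also validates types, and this sketch does not.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class LinkDefaults:
    """Stand-in for the server-side LinkRequest model (same names and defaults)."""
    key_word: Optional[str] = None
    proxies: Optional[dict] = None
    headers: Optional[dict] = None
    max_num: int = 5
    line_length: int = 15
    searchtype: str = "all"
    abstracts: str = "show"
    order: str = "-announced_date_first"
    size: int = 50
    show_meta_data: bool = True
    daily_type: str = "cs"
    max_retry: int = 3
    wait_fixed: int = 1000


def effective_params(partial: dict) -> dict:
    """Return the full parameter set that results from a partial request body."""
    return asdict(LinkDefaults(**partial))


full = effective_params({"key_word": "graph neural network", "max_num": 10})
print(full["max_num"], full["size"])
```

Only the keys you pass are overridden; everything else keeps the default listed in the model.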

If the request succeeds, the response looks like this:
```json
{
  "links": [
    "https://arxiv.org/pdf/xxx",
    "https://arxiv.org/pdf/xxx"
  ],
  "titles": [
    "xxx",
    "xxx"
  ],
  "abstract": [
    "xxx",
    "xxx"
  ],
  "authors": [
    "xxx",
    "xxx"
  ],
  "error": null
}
```
> Note that `titles`, `abstract`, and `authors` are all returned in Markdown format.
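The response is column-wise: one list per field. A minimal sketch for regrouping it into one record per paper follows, assuming the i-th entry of each list describes the same paper and all four lists have the same length. The helper name `to_records` is hypothetical.

```python
def to_records(result: dict) -> list:
    """Turn the column-wise response into one dict per paper."""
    columns = (result["links"], result["titles"], result["abstract"], result["authors"])
    fields = ("link", "title", "abstract", "authors")
    # zip(*columns) walks the four lists in parallel, one paper at a time
    return [dict(zip(fields, row)) for row in zip(*columns)]


sample = {
    "links": ["https://arxiv.org/pdf/xxx", "https://arxiv.org/pdf/yyy"],
    "titles": ["title 1", "title 2"],
    "abstract": ["abstract 1", "abstract 2"],
    "authors": ["authors 1", "authors 2"],
    "error": None,
}
for record in to_records(sample):
    print(record["link"], record["title"])
```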

A minimal `params` is:
```python
params = {
    "key_word":None
}
```
The full version, with every default spelled out, is:
```python
params = {
    "key_word":None,
    "proxies":None,
    "headers":None,
    "max_num":5,
    "line_length":15,
    "searchtype":"all",
    "abstracts":"show",
    "order":"-announced_date_first",
    "size":50,
    "show_meta_data":True,
    "daily_type":"cs",
    "max_retry":3,
    "wait_fixed":1000
}
```

Example:

```python
import requests
import logging
import time
from submodule.my_utils import custom_response_handler
import yaml
import json
import os
import sys


def init_config(set_none: bool = False):
    """Load ./config.yaml and return the settings used by the requests."""
    yaml_path = './config.yaml'
    if os.path.exists(yaml_path):
        with open(yaml_path, 'r') as config_file:
            config = yaml.safe_load(config_file)
    else:
        logging.error(f'Config file not found at {yaml_path}')
        sys.exit(1)
    openai_info = config["openai"]
    with open(openai_info['prompts_path'], 'r') as f:
        prompts = json.load(f)
    arxiv_info = config['arxiv']
    nougat_info = config["nougat"]
    proxy = arxiv_info['proxy']
    headers = arxiv_info['headers']
    base_url = openai_info['base_url']
    if set_none:
        # Ignore the proxy and headers from the config file
        proxy = headers = None
    return openai_info, arxiv_info, nougat_info, prompts, proxy, headers, base_url


openai_info, arxiv_info, nougat_info, prompts, proxy, headers, base_url = init_config(set_none=False)
# The server used for this sample run listened on port 5010
req_url = "http://61.241.103.32:5010"
# req_url = "http://127.0.0.1:8000"
link_url = req_url + '/get_links/'
keyword = "graph neural network"
params = {
    "key_word": keyword,
}
start_time = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
print(f'start time:{start_time}')
now_time = time.time()
response = requests.post(link_url, json=params)
json_info = custom_response_handler(response)
if "error" not in json_info:
    print(json_info)
else:
    print(json_info["error"])
```


```text
start time:2023-11-22 11:19:52

INFO:root: process success

{'links': ['https://arxiv.org/pdf/2311.12741', 'https://arxiv.org/pdf/2311.12657', 'https://arxiv.org/pdf/2311.12644', 'https://arxiv.org/pdf/2311.12630', 'https://arxiv.org/pdf/2311.12616'], 'titles': ['[$$~~~~Content~Augmented~Graph~Neural~Networks~$$](https://arxiv.org/pdf/2311.12741)', '[$$~~~~Carbohydrate~NMR~chemical~shift~predictions~using~E(3)~equivariant~graph~neural~networks\\\\~$$](https://arxiv.org/pdf/2311.12657)', '[$$~~~~Careful~Selection~and~Thoughtful~Discarding:~Graph~Explicit~Pooling~Utilizing~Discarded~Nodes\\\\$$](https://arxiv.org/pdf/2311.12644)', '[$$~~~~Hierarchical~Joint~Graph~Learning~and~Multivariate~Time~Series~Forecasting$$](https://arxiv.org/pdf/2311.12630)', '[$$~~~~DeepTreeGAN:~Fast~Generation~of~High~Dimensional~Point~Clouds$$](https://arxiv.org/pdf/2311.12616)'], 'abstract': ['$$~Abstract:~~~~In~recent~years,~graph…~~~$$', '$$~Abstract:~~~~…An~important~part~of~this~process~is~to~predict~the\\\\~NMR~chemical~shift~from~the~molecular~structure.~This~wo
```

When you post a request whose body has the wrong type for a field, the server reports a validation error:

```python
wrong_params = {
    "key_word": {"sda": 111}
}

response = requests.post(link_url, json=wrong_params)

json_info = custom_response_handler(response)
if "error" not in json_info:
    print(json_info)
else:
    print(json_info["error"])
```

```text
ERROR:root:request body error[], status: request error

request error input params error: 
error 1:type: type_error.str, location: request body, param key_word input:None, msg: str type expected
error 2:type: type_error.list, location: request body, param key_word input:None, msg: value is not a valid list
```
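The implementation of `custom_response_handler` is not shown here. As a rough sketch of how a report like the one above could be produced, the hypothetical function `format_validation_errors` below flattens the default FastAPI validation error body (HTTP 422, a `detail` list whose items carry `loc`, `msg`, and `type`). It assumes the server uses pydantic v1, which the `type_error.str` error types suggest, and its output format only approximates the project's.

```python
def format_validation_errors(body: dict) -> str:
    """Flatten a FastAPI 422 response body into one line per error."""
    lines = []
    for index, err in enumerate(body.get("detail", []), start=1):
        loc = err.get("loc", [])
        # loc is e.g. ["body", "key_word"]: where the value came from, then the field path
        where = loc[0] if loc else "unknown"
        param = ".".join(str(part) for part in loc[1:]) or "(root)"
        lines.append(
            f"error {index}: type={err.get('type')}, location={where}, "
            f"param={param}, msg={err.get('msg')}"
        )
    return "\n".join(lines)


sample_body = {
    "detail": [
        {"loc": ["body", "key_word"], "msg": "str type expected", "type": "type_error.str"},
    ]
}
print(format_validation_errors(sample_body))
```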



Parameters that pass validation may still cause an internal error in the backend, for example:

```python
wrong_params = {
    "key_word": "cv",
    "proxies": proxy,
}

response = requests.post(link_url, json=wrong_params)

json_info = custom_response_handler(response)
if "error" not in json_info:
    print(json_info)
else:
    print(json_info["error"])
```

### model_predict

The server accepts parameters in the following form:

```python
class Args(BaseModel):
    batchsize: int = get_batch_size()
    checkpoint: Union[str, Path] = "./pretrained_w"
    out: Union[str, Path] = './res'
    recompute: bool = False
    markdown: bool = True
    pdf: Union[List[str], List[Path], str, Path] = None


class PredictRequest(BaseModel):
    args: Args = Field(..., arbitrary_types_allowed=True)
    proxy: Union[Dict, None] = None
    headers: Union[Dict, None] = None
```

The standard form of `params` is:
```python
params = {
    "args": _args,
    "proxy": _proxy,
    "headers": _headers,
}
```

where `args` is, for example:
```python
{
    'recompute': True,
    'markdown': True,
    'kw': 'QA',
    'pdf': ['https://arxiv.org/pdf/2311.01449'],
    'num_process': 3
}
```
The minimal version:
```python
{
    'pdf': ['https://arxiv.org/pdf/2311.01449']
}
```
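As a small sketch of how the request body is assembled, the hypothetical helper `build_predict_params` wraps the PDF list in `args` and adds `proxy` and `headers` only when they are given, so the server defaults apply otherwise:

```python
from typing import List, Optional, Union


def build_predict_params(
    pdf: Union[str, List[str]],
    proxy: Optional[dict] = None,
    headers: Optional[dict] = None,
    **arg_overrides,
) -> dict:
    """Build the JSON body for /model_predict/ from a PDF URL or list of URLs."""
    # Extra keyword arguments (e.g. recompute=True) go inside "args"
    params = {"args": {"pdf": pdf, **arg_overrides}}
    if proxy is not None:
        params["proxy"] = proxy
    if headers is not None:
        params["headers"] = headers
    return params


params = build_predict_params(["https://arxiv.org/pdf/2311.01449"], recompute=True)
print(params)
```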


The output has the following form:
```json
{
    "article_ls":
        [
            "article_text1",
            "article_text2"
        ],
    "file_names":
        [
            "file_name1",
            "file_name2"
        ]
}
```
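A minimal sketch for saving the returned articles to disk follows. It assumes that `file_names[i]` is the name for `article_ls[i]`, which the parallel lists suggest but the response format does not state. The helper name `save_articles` is hypothetical.

```python
import tempfile
from pathlib import Path


def save_articles(result: dict, out_dir: str) -> list:
    """Write each returned article to out_dir and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, text in zip(result["file_names"], result["article_ls"]):
        # Keep only the final path component so a name cannot point outside out_dir
        path = out / Path(name).name
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written


sample = {
    "article_ls": ["article_text1", "article_text2"],
    "file_names": ["file_name1", "file_name2"],
}
with tempfile.TemporaryDirectory() as tmp_dir:
    for path in save_articles(sample, tmp_dir):
        print(path.name, len(path.read_text(encoding="utf-8")))
```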

```python
params = {
    "args": {
        "pdf": [
            "https://arxiv.org/pdf/2311.01449",
            "https://arxiv.org/pdf/2311.01403"
        ],
    }
}


start_time = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
print(f'start time:{start_time}')

now_time = time.time()

model_url = req_url + '/model_predict/'
flag = False

response = requests.post(model_url, json=params)
json_info = custom_response_handler(response)
if "error" not in json_info:
    print(json_info)
else:
    print(json_info["error"])
```



```text
start time:2023-11-22 10:36:09

INFO:root: process success

{'article_ls': ['# TopicGPT: A Prompt-based Topic Modeling Framework\n\nChau Minh Pham\\({}^{1}\\)   Alexander Hoyle\\({}^{2}\\)   Simeng Sun\\({}^{1}\\)   Mohit Iyyer\\({}^{1}\\)\n\n\\({}^{1}\\)University of Massachusetts Amherst  \\({}^{2}\\)University of Maryland\n\n{ctpham, simengsun, miyyer}@umass.edu,\n\nhoyle@umd.edu\n\n###### Abstract\n\nTopic modeling is a well-established technique for exploring text corpora. Conventional topic models (e.g., LDA) represent topics as bags of words that often require "reading the tea leaves" to interpret; additionally, they offer users minimal semantic control over topics. To tackle these issues, we introduce TopicGPT, a prompt-based framework that uses large language models (LLMs) to uncover latent topics within a provided text collection. TopicGPT produces topics that align better with human categorization compared to competing methods: for example, it achieves a harmonic mean purity of 0.74 against human-annotated Wikipedia topics compared t
```

If the posted parameters are wrong, the server returns an error message for each invalid key, with one entry per value type that the key accepts:

```python
params = {
    "args": {
        "pdf": {
            "url": "https://arxiv.org/pdf/2311.01449",
            "title": "test"
        },
    },
    "proxy": 222
}


start_time = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
print(f'start time:{start_time}')

now_time = time.time()

model_url = req_url + '/model_predict/'
flag = False

response = requests.post(model_url, json=params)
json_info = custom_response_handler(response)
if "error" not in json_info:
    print(json_info)
else:
    print(json_info["error"])
```

```text
ERROR:root:request body error[], status: request error

start time:2023-11-22 11:12:06
request error input params error: 
error 1:type: type_error.list, location: request body, param args input:None, msg: value is not a valid list
error 2:type: type_error.list, location: request body, param args input:None, msg: value is not a valid list
error 3:type: type_error.str, location: request body, param args input:None, msg: str type expected
error 4:type: type_error.path, location: request body, param args input:None, msg: value is not a valid path
error 5:type: type_error.dict, location: request body, param proxy input:None, msg: value is not a valid dict
```
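The four `args` errors correspond to the four alternatives in the `pdf` type (`List[str]`, `List[Path]`, `str`, `Path`). To catch a bad value before the round trip, a client-side check could look like the sketch below. `normalize_pdf_arg` is a hypothetical helper, not part of the project, and it converts paths to strings so that the result is JSON-serializable.

```python
from pathlib import Path


def normalize_pdf_arg(pdf) -> list:
    """Return pdf as a list of strings, or raise TypeError for unsupported shapes."""
    if isinstance(pdf, (str, Path)):
        return [str(pdf)]
    if isinstance(pdf, (list, tuple)) and all(isinstance(p, (str, Path)) for p in pdf):
        return [str(p) for p in pdf]
    raise TypeError(
        f"pdf must be a str, a Path, or a list of them, got {type(pdf).__name__}"
    )


print(normalize_pdf_arg("https://arxiv.org/pdf/2311.01449"))
try:
    normalize_pdf_arg({"url": "https://arxiv.org/pdf/2311.01449", "title": "test"})
except TypeError as exc:
    print("rejected:", exc)
```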



### get_summaries 

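The server-side request model is shown below. Field names such as `artile_text` and `resummry_prompts` are spelled exactly as the server defines them, so requests must use the same spelling: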

```python
class SummaryRequest(BaseModel):
    openai_info:dict = ...
    artile_text:Union[str,None] = ...
    file_name:Union[str,None] = ...
    requests_per_minute:Union[int,None] = None
    proxy:Union[dict,None] = None
    summary_prompts: Union[dict, str] = gpt_prompts['section summary']
    resummry_prompts: Union[dict, str] = gpt_prompts["blog summary"]
    ignore_titles:Union[List,None] = ["references","appendix"]
    acquire_mode: Literal['url', 'openai'] = 'url'
    prompt_factor: float = 0.8  # prompt tokens / total tokens
    num_processes: int = 3
    base_url: str = "https://openai.huatuogpt.cn/v1"
    init_grid: int = 2
    split_mode: str = 'group'
    gpt_config: Union[basic_model_info,None] = None

    def __init__(self, **data):
        if data.get("gpt_config",None) is None:
            data["gpt_config"] = basic_model_info()
        super().__init__(**data)
```


The response has the following format:
```json
{
    "titles": "str:titles",
    "authors": "str:authors",
    "affiliations": "str:affiliations",
    "total_resp": "str:summary result",
    "re_respnse": "str:blog result or resummary result"
                
}
```
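As an illustration of how the fields fit together, the sketch below assembles a response into a single Markdown document. The helper name `render_summary` and the section headings are hypothetical; the key names, including `re_respnse`, are taken from the format above.

```python
def render_summary(result: dict) -> str:
    """Combine the fields of a /get_summaries/ response into one Markdown string."""
    parts = [
        result.get("titles", ""),
        f"**Authors:** {result.get('authors', '')}",
        f"**Affiliations:** {result.get('affiliations', '')}",
        "## Section summaries",
        result.get("total_resp", ""),
        "## Blog summary",
        result.get("re_respnse", ""),
    ]
    # Drop empty parts and separate the rest with blank lines
    return "\n\n".join(part.strip() for part in parts if part and part.strip())


sample = {
    "titles": "# A title",
    "authors": "A. Author",
    "affiliations": "Some University",
    "total_resp": "section text",
    "re_respnse": "blog text",
}
print(render_summary(sample))
```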


The standard form of `params`:
```python
params = {
    "openai_info": {
        "api_key": "sk-xxxx",
        "ignore_title": ["references", "appendix"],
        "base_url": "https://openai.huatuogpt.cn/v1",
    },
    "artile_text": "article content",  # str
    "file_name": "name used to identify the saved file",  # str
    # OpenAI limits this API key to 3 requests per minute; set None if not limited
    "requests_per_minute": 3,
    "proxy": proxy,
    "acquire_mode": "url",  # "url" or "openai"
    "prompt_factor": 0.8,  # float: max prompt length / max_tokens
    "summary_prompts": "summary_prompts",  # dict or str
    "resummry_prompts": "resummry_prompts",  # dict or str
    # Note: the server model above names this field `ignore_titles`
    "ignore_title": ["references", "appendix"],
    "num_processes": 3,  # int: number of processes
    "base_url": "https://openai.huatuogpt.cn/v1",  # or "https://api.openai.com/v1"
    "init_grid": 2,  # int: title level of the subgroup
    "split_mode": "group",  # "group" or "half"
    # Fields of gpt_config with the values given in the original listing
    "gpt_config": {
        "model": "gpt-3.5-turbo-16k-0613",  # str
        "temperature": 0.9,  # float
        "max_tokens": 16385,  # int
        "top_p": 1,  # float
        "frequency_penalty": 0,  # float
        "presence_penalty": 0,  # float
    },
}
```

The minimal form:
```python
params = {
    "openai_info": {
        "api_key": "sk-xxxx",
    },
    "artile_text": "article content",  # str
    "file_name": "name used to identify the saved file",  # str
}
```
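Since the body carries an API key, it can help to read the key from the environment instead of writing it into a notebook. The following is a minimal sketch: the helper `build_summary_params` and the environment variable name `OPENAI_API_KEY` are assumptions for illustration, not part of the project.

```python
import os
from typing import Optional


def build_summary_params(
    article_text: str,
    file_name: str,
    api_key: Optional[str] = None,
    **overrides,
) -> dict:
    """Build the minimal /get_summaries/ body, taking the key from the environment if not given."""
    key = api_key or os.environ.get("OPENAI_API_KEY")
    if not key:
        raise ValueError("no API key: pass api_key or set OPENAI_API_KEY")
    params = {
        "openai_info": {"api_key": key},
        "artile_text": article_text,  # spelled as in the server model
        "file_name": file_name,
    }
    # Optional top-level fields such as num_processes or split_mode
    params.update(overrides)
    return params


params = build_summary_params("article content", "2309_08182.mmd", api_key="sk-xxxx")
print(sorted(params))
```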

```python
from typing import Union, List, Literal
from pydantic import BaseModel

start_time = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
print(f'start time:{start_time}')


summary_url = req_url + "/get_summaries/"
print('request url: ', summary_url)
text_path = './res/raw_mmd/2309_08182.mmd'
with open(text_path, 'r', encoding='utf-8') as f:
    text = f.read()
params = {
    "openai_info": {
        # Replace with your own key; never commit a real key
        "api_key": "sk-xxxx"
    },
    "artile_text": text,
    "file_name": "2309_08182.mmd"
}

flag = False

now_time = time.time()
response = requests.post(summary_url, json=params)
json_info = custom_response_handler(response)
if "error" not in json_info:
    print(json_info)
else:
    print(json_info["error"])
```


```text
start time:2023-11-22 11:14:22
request url:  http://61.241.103.32:5010/get_summaries/

INFO:root: process success

{'titles': '# Using Large Language Model to Solve and Explain Physics Word Problems Approaching Human Level', 'authors': 'Jingzhe DingYan Cen Xinyuan Wei{yancen,xinyuanwei}@fudan.edu.cn', 'affiliations': 'Columbia Universityjd4001@columbia.eduFudan University', 'total_resp': '\n## abstract:\nrequest openai error, SpawnPoolWorker-1:1 error code: Request Error: 401 Client Error: Authorization Required for url: https://api.ai-gaochao.cn/v1/chat/completions\n## intro:\nrequest openai error, SpawnPoolWorker-1:2 error code: Request Error: 401 Client Error: Authorization Required for url: https://api.ai-gaochao.cn/v1/chat/completions\n## related work:\nrequest openai error, SpawnPoolWorker-1:3 error code: Request Error: 401 Client Error: Authorization Required for url: https://api.ai-gaochao.cn/v1/chat/completions\n## dataset:\nrequest openai error, SpawnPoolWorker-1:3 error code: Request Error: 401 Client Error: Authorization Required for url: https://api.ai-gaochao.cn/v1/chat/completions\n#
```

In this run the API key was rejected (401 Authorization Required), so `total_resp` contains the error message for each section instead of a summary.
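The three endpoints can be chained: `get_links` finds papers, `model_predict` converts the PDFs to text, and `get_summaries` summarizes each article. The sketch below shows one way to wire them together. The function `summarize_keyword` and the `post_json` callable are hypothetical, and the sketch assumes that `file_names[i]` belongs to `article_ls[i]`.

```python
from typing import Callable


def summarize_keyword(
    post_json: Callable[[str, dict], dict],
    base_url: str,
    keyword: str,
    api_key: str,
) -> list:
    """Run get_links -> model_predict -> get_summaries and return one summary per article."""
    links = post_json(base_url + "/get_links/", {"key_word": keyword})
    # A successful get_links response carries "error": null, so test truthiness
    if links.get("error"):
        raise RuntimeError(f"get_links failed: {links['error']}")

    predicted = post_json(base_url + "/model_predict/", {"args": {"pdf": links["links"]}})
    if predicted.get("error"):
        raise RuntimeError(f"model_predict failed: {predicted['error']}")

    summaries = []
    for name, text in zip(predicted["file_names"], predicted["article_ls"]):
        body = {
            "openai_info": {"api_key": api_key},
            "artile_text": text,  # spelled as in the server model
            "file_name": name,
        }
        summaries.append(post_json(base_url + "/get_summaries/", body))
    return summaries
```

With `requests` installed, `post_json` could be as simple as `lambda url, body: requests.post(url, json=body).json()`. Passing it in as a parameter keeps the chaining logic testable without a running server.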