
Add some good practices from sicarator #17

Merged: 18 commits merged into main from add-some-good-practices-from-sicarator on Jul 25, 2023

Conversation

@Wirg (Collaborator) commented Jul 24, 2023

No description provided.

@github-actions bot commented Jul 24, 2023

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/demo/main.py

The code is generally good, but there are areas for potential improvement.

  1. The api_key variable is declared as None at the top of the script but not used until much later. Declaring it just before its first use would improve readability.

  2. The api_key assignment could be simplified: instead of separately checking whether api_key is None or an empty string, a single truthy check covers both cases.

  3. The try/except block around the openai.ChatCompletion.create call should be more specific. Catching Exception is too broad and can hide unexpected issues; catch the specific exceptions this call is expected to raise.

Here's how you could implement these changes:

# Pick the API key that matches the selected model:
# the Hugging Face key for "hf" models, the OpenAI key otherwise
api_key = huggingface_api_key if model_name.startswith("hf") else openai_api_key

if not api_key:
    st.error("Please provide an API key")
    st.stop()

openai.api_key = api_key
openai.api_base = genoss_endpoint

try:
    response = openai.ChatCompletion.create(
        model=model_name,
        messages=st.session_state.messages,
    )
    msg = response.choices[0].message
except openai.error.OpenAIError as e:
    st.error(e)
    st.stop()
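
Note: in the pre-1.0 openai Python package, openai.error.OpenAIError is the base class for the client's exceptions (e.g. APIError, RateLimitError, AuthenticationError), so catching it covers the failures this call can raise without swallowing unrelated bugs.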

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/api/completions_routes.py

The code is generally good, but there are areas for potential improvement.

  1. The RequestBody class could be moved to a separate file to improve organization and readability.

  2. The post_chat_completions function could be simplified. The model variable is only used once, so you could call ModelFactory.get_model_from_name directly when you need it.

Here's how you could implement these changes:

from genoss.entities.request_body import RequestBody

@completions_router.post("/chat/completions", tags=["Chat Completions"])
async def post_chat_completions(
    body: RequestBody = Body(...),
    api_key: str = Depends(AuthHandler.check_auth_header, use_cache=False),
) -> dict[str, Any]:
    model = ModelFactory.get_model_from_name(body.model, api_key)

    if model is None:
        raise HTTPException(status_code=404, detail="Model not found")

    logger.info(
        f"Received chat completions request for {body.model} with messages {body.messages[-1].content}"
    )

    return model.generate_answer(body.messages[-1].content)
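
For completeness, the extracted module might look something like the sketch below. The fields are assumptions inferred from how body is used above (body.model and body.messages[-1].content); the real PR may define RequestBody differently.

# genoss/entities/request_body.py (hypothetical module path)
from pydantic import BaseModel

from genoss.entities.chat.message import Message


class RequestBody(BaseModel):
    model: str
    messages: list[Message]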

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/api/embeddings_routes.py

The code is generally good, but there are areas for potential improvement.

  1. The post_embeddings function could be improved. The gpt variable is only used once, so you could call Gpt4AllLLM directly when you need it.

  2. The function returns a hardcoded list of zeros if gpt is None. It would be better to raise an exception in this case to indicate that something went wrong.

Here's how you could implement these changes:

@embeddings_router.post("/embeddings", tags=["Embeddings"])
async def post_embeddings(
    model: str,
    input: str,
) -> list[float]:
    if model != "gpt4all":
        raise HTTPException(status_code=400, detail="Invalid model")

    response = Gpt4AllLLM(name="gpt4all").generate_embedding(input)

    return response

LOGAF Level 5 - /home/runner/work/GenossGPT/GenossGPT/genoss/api/misc_routes.py

The code is excellent and needs no changes.

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/auth/auth_handler.py

The code is generally good, but there are areas for potential improvement.

  1. The check_auth_header function could be simplified by splitting the authorization header once and validating the two parts directly.

Here's how you could implement these changes:

@staticmethod
async def check_auth_header(
    authorization: str | None = Header(None),
) -> str | None:
    if authorization is None:
        return None

    parts = authorization.split()
    if len(parts) != 2 or parts[0].lower() != "bearer":
        raise HTTPException(status_code=403, detail="Invalid authorization header")

    return parts[1]
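
With this dependency in place, clients authenticate with a standard bearer header (Authorization: Bearer <api-key>); check_auth_header returns the token portion, or None when no header is sent.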

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/entities/chat/chat_completion.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/entities/chat/message.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/base_genoss.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/fake_llm.py

The code is generally good, but there are areas for potential improvement.

  1. The generate_answer function could be simplified. The llm variable is only used once, so you could call FakeListLLM directly when you need it.

Here's how you could implement these changes:

def generate_answer(self, question: str) -> dict[str, Any]:
    response_text = LLMChain(
        llm=FakeListLLM(responses=["Hello from FakeLLM!"]), prompt=prompt_template
    )(question)

    answer = response_text["text"]
    chat_completion = ChatCompletion(
        model=self.name, answer=answer, question=question
    )

    return chat_completion.to_dict()
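
As a side note, langchain's FakeListLLM returns the strings in its responses list in order, which keeps this endpoint deterministic and convenient for tests.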

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/hf_hub/base_hf_hub.py

The code is generally good, but there are areas for potential improvement.

  1. The generate_answer function could be simplified. The llm variable is only used once, so you could call HuggingFaceHub directly when you need it.

Here's how you could implement these changes:

def generate_answer(self, question: str) -> dict[str, Any]:
    response_text = LLMChain(
        prompt=prompt_template,
        llm=HuggingFaceHub(
            repo_id=self.repo_id,
            huggingfacehub_api_token=self.huggingfacehub_api_token,
        ),
    )(question)

    answer = response_text["text"]

    chat_completion = ChatCompletion(
        model=self.name, question=question, answer=answer
    )

    return chat_completion.to_dict()

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/hf_hub/falcon.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/hf_hub/gpt2.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/hf_hub/llama2.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/local/base_local.py

The code is generally good, but there are areas for potential improvement.

  1. The validate_model_path function could be improved. It currently raises a ValueError if the provided path does not exist. It would be better to raise a more specific exception, such as FileNotFoundError.

Here's how you could implement these changes:

@validator("model_path")
def validate_model_path(cls, v: str) -> str:
    if not Path(v).is_file():
        raise FileNotFoundError(f"The provided model path does not exist: {v}")
    return v
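
One caveat with this suggestion: pydantic (v1) only converts ValueError, TypeError, and AssertionError raised inside validators into validation errors; a FileNotFoundError would propagate as an unhandled exception instead of surfacing as a ValidationError, which may or may not be the desired behavior here.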

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/local/gpt4all.py

The code is generally good, but there are areas for potential improvement.

  1. The generate_answer function could be simplified. The llm variable is only used once, so you could call GPT4All directly when you need it.

Here's how you could implement these changes:

def generate_answer(self, question: str) -> dict[str, Any]:
    response_text = LLMChain(
        llm=GPT4All(model=self.model_path), prompt=prompt_template
    )(question)

    answer = response_text["text"]

    chat_completion = ChatCompletion(
        model=self.name, question=question, answer=answer
    )

    return chat_completion.to_dict()

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/openai/openai_llm.py

The code is generally good, but there are areas for potential improvement.

  1. The generate_answer function could be simplified. The llm variable is only used once, so you could call ChatOpenAI directly when you need it.

Here's how you could implement these changes:

def generate_answer(self, question: str) -> dict[str, Any]:
    response_text = LLMChain(
        llm=ChatOpenAI(model_name=self.model_name, openai_api_key=self.openai_api_key),
        prompt=prompt_template,
    )(question)

    answer = response_text["text"]
    chat_completion = ChatCompletion(
        model=self.name, answer=answer, question=question
    )

    return chat_completion.to_dict()

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/services/model_factory.py

The code is generally good, but there are areas for potential improvement.

  1. The get_model_from_name function could be simplified. The model variable is only used once, so you could return the result of the if statements directly.

Here's how you could implement these changes:

@staticmethod
def get_model_from_name(
    name: str, api_key: str | None = None
) -> BaseGenossLLM | None:
    if name.lower() in OPENAI_NAME_LIST:
        return OpenAILLM(model_name=name, api_key=api_key)
    if name.lower() == "gpt4all":
        return Gpt4AllLLM()
    if name.lower().startswith("hf-llama2"):
        return HuggingFaceHubLlama2LLM(api_key=api_key)
    if name.lower().startswith("hf-gpt2"):
        return HuggingFaceHubGPT2LLM(api_key=api_key)
    if name.lower().startswith("hf-falcon"):
        return HuggingFaceHubFalconLLM(api_key=api_key)
    if name == FAKE_LLM_NAME:
        return FakeLLM()
    return None
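
Since name.lower() is evaluated in every branch, binding it once at the top (e.g. lowered = name.lower()) would avoid the repeated calls; behavior is identical either way.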

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/logger.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/main.py

The code is generally good, but there are areas for potential improvement.

  1. The http_exception_handler and validation_exception_handler functions could be simplified. The exc variable is only used once, so you could return the JSONResponse directly.

Here's how you could implement these changes:

@app.exception_handler(HTTPException)
async def http_exception_handler(_, exc):
    return JSONResponse(
        status_code=exc.status_code,
        content={"detail": exc.detail},
    )


@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    exc_str = f"{exc}".replace("\n", " ").replace("   ", " ")
    logger.error(f"{request}: {exc_str}")
    content = {"status_code": 10422, "message": exc_str, "data": None}
    return JSONResponse(
        content=content, status_code=status.HTTP_422_UNPROCESSABLE_ENTITY
    )

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/tests/services/test_model_factory.py

The code is high-quality and requires only minor tweaks.
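
For illustration, a test for the factory above might look like the sketch below. This is a hypothetical example, not the PR's actual test file; it assumes FAKE_LLM_NAME and FakeLLM are importable from genoss.llm.fake_llm.

from genoss.llm.fake_llm import FAKE_LLM_NAME, FakeLLM
from genoss.services.model_factory import ModelFactory


def test_get_model_from_name_returns_fake_llm():
    assert isinstance(ModelFactory.get_model_from_name(FAKE_LLM_NAME), FakeLLM)


def test_get_model_from_name_returns_none_for_unknown_model():
    assert ModelFactory.get_model_from_name("unknown-model") is None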


👍🔧📚


Powered by Code Review GPT

@Wirg (Collaborator, Author) commented Jul 24, 2023

  • migrate requirements.txt -> poetry, flake8 -> ruff (stricter + supposedly faster), unittest -> pytest, pyright -> mypy (see the pyproject.toml sketch after this list)

  • add tests in the CI

  • modify docker to use poetry and docker compose to work directly

  • ⚠️ Break local GPT4All: it becomes optional, so I don't have issues with GPT4All inside Docker on macOS. In the future, we will probably move gpt4all into a separate Docker service?

  • ⚠️ remove weird scipy misc import? (not used?)

  • ⚠️ remove a lot of requirements that did not seem to be used

  • Some TODOs have not been investigated

  • ⚠️ 🛑 we need new install instructions => ✅

  • ⚠️ 🛑 we need to test what might have been broken => ✅
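
As referenced above, a minimal pyproject.toml along these lines would wire up the new tooling. This is an illustrative sketch, not the PR's actual configuration; package metadata, versions, and tool settings are placeholders.

# pyproject.toml (illustrative)
[tool.poetry]
name = "genossgpt"
version = "0.1.0"
description = "GenossGPT"
authors = ["..."]

[tool.poetry.dependencies]
python = "^3.11"

[tool.poetry.group.dev.dependencies]
pytest = "*"
ruff = "*"
mypy = "*"

[tool.ruff]
line-length = 88

[tool.mypy]
strict = true

[tool.pytest.ini_options]
testpaths = ["tests"]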

To think about:

  • cite Sicarator
  • auto-generate requirements.txt?

@Wirg Wirg marked this pull request as ready for review July 25, 2023 08:49
@Wirg Wirg merged commit e238743 into main Jul 25, 2023
7 checks passed
@Wirg Wirg deleted the add-some-good-practices-from-sicarator branch July 25, 2023 09:40