
Add some good practices from sicarator #17

Merged: 18 commits merged into main from add-some-good-practices-from-sicarator on Jul 25, 2023

Conversation

@Wirg (Collaborator) commented Jul 24, 2023

No description provided.

@github-actions bot commented Jul 24, 2023

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/demo/main.py

The code is generally good, but there are areas for potential improvement.

  1. The api_key variable is declared as None at the top of the script but not used until much later. Declaring it just before its first use would improve readability.

  2. The api_key assignment could be simplified: instead of separately checking whether api_key is None or an empty string, a single truthy check covers both cases.

  3. The try/except block around the openai.ChatCompletion.create call should be more specific. Catching Exception is too broad and can hide unexpected issues; catch the specific exceptions this call is expected to raise.

Here's how you could implement these changes:

# Pick the API key that matches the selected model:
# the Hugging Face key for "hf" models, the OpenAI key otherwise
api_key = huggingface_api_key if model_name.startswith("hf") else openai_api_key

if not api_key:
    st.error("Please provide an API key")
    st.stop()

openai.api_key = api_key
openai.api_base = genoss_endpoint

try:
    response = openai.ChatCompletion.create(
        model=model_name,
        messages=st.session_state.messages,
    )
    msg = response.choices[0].message
except openai.error.OpenAIError as e:
    st.error(e)
    st.stop()
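
Note: in the pre-1.0 openai Python package, openai.error.OpenAIError is the base class for the client's exceptions (e.g. APIError, RateLimitError, AuthenticationError), so catching it covers the failures this call can raise without swallowing unrelated bugs.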

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/api/completions_routes.py

The code is generally good, but there are areas for potential improvement.

  1. The RequestBody class could be moved to a separate file to improve organization and readability.

  2. The post_chat_completions function could be simplified. The model variable is only used once, so you could call ModelFactory.get_model_from_name directly when you need it.

Here's how you could implement these changes:

from genoss.entities.request_body import RequestBody

@completions_router.post("/chat/completions", tags=["Chat Completions"])
async def post_chat_completions(
    body: RequestBody = Body(...),
    api_key: str = Depends(AuthHandler.check_auth_header, use_cache=False),
) -> dict[str, Any]:
    model = ModelFactory.get_model_from_name(body.model, api_key)

    if model is None:
        raise HTTPException(status_code=404, detail="Model not found")

    logger.info(
        f"Received chat completions request for {body.model} with messages {body.messages[-1].content}"
    )

    return model.generate_answer(body.messages[-1].content)
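
For completeness, the extracted module might look something like the sketch below. The fields are assumptions inferred from how body is used above (body.model and body.messages[-1].content); the real PR may define RequestBody differently.

# genoss/entities/request_body.py (hypothetical module path)
from pydantic import BaseModel

from genoss.entities.chat.message import Message


class RequestBody(BaseModel):
    model: str
    messages: list[Message]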

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/api/embeddings_routes.py

The code is generally good, but there are areas for potential improvement.

  1. The post_embeddings function could be improved. The gpt variable is only used once, so you could call Gpt4AllLLM directly when you need it.

  2. The function returns a hardcoded list of zeros if gpt is None. It would be better to raise an exception in this case to indicate that something went wrong.

Here's how you could implement these changes:

@embeddings_router.post("/embeddings", tags=["Embeddings"])
async def post_embeddings(
    model: str,
    input: str,
) -> list[float]:
    if model != "gpt4all":
        raise HTTPException(status_code=400, detail="Invalid model")

    response = Gpt4AllLLM(name="gpt4all").generate_embedding(input)

    return response

LOGAF Level 5 - /home/runner/work/GenossGPT/GenossGPT/genoss/api/misc_routes.py

The code is excellent and needs no changes.

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/auth/auth_handler.py

The code is generally good, but there are areas for potential improvement.

  1. The check_auth_header function could be simplified by splitting the authorization header once and validating the two parts directly.

Here's how you could implement these changes:

@staticmethod
async def check_auth_header(
    authorization: str | None = Header(None),
) -> str | None:
    if authorization is None:
        return None

    parts = authorization.split()
    if len(parts) != 2 or parts[0].lower() != "bearer":
        raise HTTPException(status_code=403, detail="Invalid authorization header")

    return parts[1]
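
With this dependency in place, clients authenticate with a standard bearer header (Authorization: Bearer <api-key>); check_auth_header returns the token portion, or None when no header is sent.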

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/entities/chat/chat_completion.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/entities/chat/message.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/base_genoss.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/fake_llm.py

The code is generally good, but there are areas for potential improvement.

  1. The generate_answer function could be simplified. The llm variable is only used once, so you could call FakeListLLM directly when you need it.

Here's how you could implement these changes:

def generate_answer(self, question: str) -> dict[str, Any]:
    response_text = LLMChain(
        llm=FakeListLLM(responses=["Hello from FakeLLM!"]), prompt=prompt_template
    )(question)

    answer = response_text["text"]
    chat_completion = ChatCompletion(
        model=self.name, answer=answer, question=question
    )

    return chat_completion.to_dict()
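
As a side note, langchain's FakeListLLM returns the strings in its responses list in order, which keeps this endpoint deterministic and convenient for tests.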

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/hf_hub/base_hf_hub.py

The code is generally good, but there are areas for potential improvement.

  1. The generate_answer function could be simplified. The llm variable is only used once, so you could call HuggingFaceHub directly when you need it.

Here's how you could implement these changes:

def generate_answer(self, question: str) -> dict[str, Any]:
    response_text = LLMChain(
        prompt=prompt_template,
        llm=HuggingFaceHub(
            repo_id=self.repo_id,
            huggingfacehub_api_token=self.huggingfacehub_api_token,
        ),
    )(question)

    answer = response_text["text"]

    chat_completion = ChatCompletion(
        model=self.name, question=question, answer=answer
    )

    return chat_completion.to_dict()

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/hf_hub/falcon.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/hf_hub/gpt2.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/hf_hub/llama2.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/local/base_local.py

The code is generally good, but there are areas for potential improvement.

  1. The validate_model_path function could be improved. It currently raises a ValueError if the provided path does not exist. It would be better to raise a more specific exception, such as FileNotFoundError.

Here's how you could implement these changes:

@validator("model_path")
def validate_model_path(cls, v: str) -> str:
    if not Path(v).is_file():
        raise FileNotFoundError(f"The provided model path does not exist: {v}")
    return v
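
One caveat with this suggestion: pydantic (v1) only converts ValueError, TypeError, and AssertionError raised inside validators into validation errors; a FileNotFoundError would propagate as an unhandled exception instead of surfacing as a ValidationError, which may or may not be the desired behavior here.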

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/local/gpt4all.py

The code is generally good, but there are areas for potential improvement.

  1. The generate_answer function could be simplified. The llm variable is only used once, so you could call GPT4All directly when you need it.

Here's how you could implement these changes:

def generate_answer(self, question: str) -> dict[str, Any]:
    response_text = LLMChain(
        llm=GPT4All(model=self.model_path), prompt=prompt_template
    )(question)

    answer = response_text["text"]

    chat_completion = ChatCompletion(
        model=self.name, question=question, answer=answer
    )

    return chat_completion.to_dict()

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/llm/openai/openai_llm.py

The code is generally good, but there are areas for potential improvement.

  1. The generate_answer function could be simplified. The llm variable is only used once, so you could call ChatOpenAI directly when you need it.

Here's how you could implement these changes:

def generate_answer(self, question: str) -> dict[str, Any]:
    response_text = LLMChain(
        llm=ChatOpenAI(model_name=self.model_name, openai_api_key=self.openai_api_key),
        prompt=prompt_template,
    )(question)

    answer = response_text["text"]
    chat_completion = ChatCompletion(
        model=self.name, answer=answer, question=question
    )

    return chat_completion.to_dict()

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/genoss/services/model_factory.py

The code is generally good, but there are areas for potential improvement.

  1. The get_model_from_name function could be simplified. The model variable is only used once, so you could return the result of the if statements directly.

Here's how you could implement these changes:

@staticmethod
def get_model_from_name(
    name: str, api_key: str | None = None
) -> BaseGenossLLM | None:
    if name.lower() in OPENAI_NAME_LIST:
        return OpenAILLM(model_name=name, api_key=api_key)
    if name.lower() == "gpt4all":
        return Gpt4AllLLM()
    if name.lower().startswith("hf-llama2"):
        return HuggingFaceHubLlama2LLM(api_key=api_key)
    if name.lower().startswith("hf-gpt2"):
        return HuggingFaceHubGPT2LLM(api_key=api_key)
    if name.lower().startswith("hf-falcon"):
        return HuggingFaceHubFalconLLM(api_key=api_key)
    if name == FAKE_LLM_NAME:
        return FakeLLM()
    return None
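
Since name.lower() is evaluated in every branch, binding it once at the top (e.g. lowered = name.lower()) would avoid the repeated calls; behavior is identical either way.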

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/logger.py

The code is high-quality and requires only minor tweaks.

LOGAF Level 3 - /home/runner/work/GenossGPT/GenossGPT/main.py

The code is generally good, but there are areas for potential improvement.

  1. The http_exception_handler and validation_exception_handler functions could be simplified. The exc variable is only used once, so you could return the JSONResponse directly.

Here's how you could implement these changes:

@app.exception_handler(HTTPException)
async def http_exception_handler(_, exc):
    return JSONResponse(
        status_code=exc.status_code,
        content={"detail": exc.detail},
    )


@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    exc_str = f"{exc}".replace("\n", " ").replace("   ", " ")
    logger.error(f"{request}: {exc_str}")
    content = {"status_code": 10422, "message": exc_str, "data": None}
    return JSONResponse(
        content=content, status_code=status.HTTP_422_UNPROCESSABLE_ENTITY
    )

LOGAF Level 4 - /home/runner/work/GenossGPT/GenossGPT/tests/services/test_model_factory.py

The code is high-quality and requires only minor tweaks.
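
For illustration, a test for the factory above might look like the sketch below. This is a hypothetical example, not the PR's actual test file; it assumes FAKE_LLM_NAME and FakeLLM are importable from genoss.llm.fake_llm.

from genoss.llm.fake_llm import FAKE_LLM_NAME, FakeLLM
from genoss.services.model_factory import ModelFactory


def test_get_model_from_name_returns_fake_llm():
    assert isinstance(ModelFactory.get_model_from_name(FAKE_LLM_NAME), FakeLLM)


def test_get_model_from_name_returns_none_for_unknown_model():
    assert ModelFactory.get_model_from_name("unknown-model") is None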


👍🔧📚


Powered by Code Review GPT

@Wirg (Collaborator, Author) commented Jul 24, 2023

  • migrate requirements.txt -> poetry, flake8 -> ruff (stricter + supposedly faster), unittest -> pytest, pyright -> mypy (see the pyproject.toml sketch after this list)

  • add tests in the CI

  • modify docker to use poetry and docker compose to work directly

  • ⚠️ Break local GPT4All: it becomes optional, so I don't have issues with GPT4All inside Docker on macOS. In the future, we will probably move gpt4all into a separate Docker service?

  • ⚠️ remove weird scipy misc import? (not used?)

  • ⚠️ remove a lot of requirements that did not seem to be used

  • Some TODOs have not been investigated

  • ⚠️ 🛑 we need new install instructions => ✅

  • ⚠️ 🛑 we need to test what might have been broken => ✅
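
As referenced above, a minimal pyproject.toml along these lines would wire up the new tooling. This is an illustrative sketch, not the PR's actual configuration; package metadata, versions, and tool settings are placeholders.

# pyproject.toml (illustrative)
[tool.poetry]
name = "genossgpt"
version = "0.1.0"
description = "GenossGPT"
authors = ["..."]

[tool.poetry.dependencies]
python = "^3.11"

[tool.poetry.group.dev.dependencies]
pytest = "*"
ruff = "*"
mypy = "*"

[tool.ruff]
line-length = 88

[tool.mypy]
strict = true

[tool.pytest.ini_options]
testpaths = ["tests"]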

To think about:

  • cite Sicarator
  • auto-generate requirements.txt?

@Wirg Wirg marked this pull request as ready for review July 25, 2023 08:49
@Wirg Wirg merged commit e238743 into main Jul 25, 2023
7 checks passed
@Wirg Wirg deleted the add-some-good-practices-from-sicarator branch July 25, 2023 09:40