Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

54 refactor split out client #59

Merged
merged 12 commits into from
May 28, 2024
Merged

54 refactor split out client #59

merged 12 commits into from
May 28, 2024

Conversation

madeddie
Copy link
Owner

@madeddie madeddie commented May 26, 2024

Also removing the src layer since I find it annoying, might put it back at a later date.

Summary by CodeRabbit

  • New Features

    • Introduced a client class to interact with a remote server for task management.
    • Added functionality for logging in, retrieving task lists, getting task details, and submitting tasks.
    • Implemented a Python script for interacting with the mooc.fi CSES instance, including logging in, listing exercises, showing exercise details, and submitting solutions.
  • Tests

    • Added test cases for the Client class to ensure reliable task management and interaction with the server.

@madeddie madeddie linked an issue May 26, 2024 that may be closed by this pull request
Copy link
Contributor

coderabbitai bot commented May 26, 2024

Walkthrough

The recent updates enhance the tyora package by refining its structure and expanding its capabilities. The package declaration was simplified, and extensive test cases were introduced for the Client class. New functionalities include task management, configuration handling, and task submission processes. These changes collectively improve the package's usability and robustness, especially in interacting with the mooc.fi CSES instance.

Changes

File Change Summary
pyproject.toml Updated package declaration from "src/tyora" to "tyora" within the tool.hatch.build.targets.wheel section.
tests/test_client.py Introduced test cases for the Client class, covering task retrieval, task state management, and task submission.
tyora/client.py Added Client class, Task dataclass, TaskState enum, and functions for task management and server interaction.
tyora/tyora.py Introduced script functionalities for logging in, listing exercises, showing details, submitting solutions, and managing configurations.

Poem

In the realm of code so bright,
tyora gains new might.
Tasks and tests now dance in line,
Configs and clients intertwine.
Through each function, purpose clear,
Progress made, we all cheer.
🎉🐇✨


Tip

Early Access Features
  • gpt-4o model for chat

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to full the review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

PR Review 🔍

⏱️ Estimated effort to review [1-5]

4, because the PR involves a significant refactor, including the introduction of new classes, reorganization of existing code into new modules, and changes to session handling. The changes are spread across multiple files and involve both logic and structural adjustments, which require careful review to ensure functionality and integration are maintained.

🧪 Relevant tests

Yes

⚡ Possible issues

Possible Bug: The removal of username and password from the session constructor and their addition to the login method might cause issues if not handled correctly in all instances where the session is used.

Refactoring Risk: Extensive changes in code structure could lead to unforeseen bugs if not all dependencies and interactions are correctly updated and tested.

🔒 Security concerns

No

Copy link

PR Code Suggestions ✨

CategorySuggestion                                                                                                                                    Score
Possible issue
Add a check to ensure the filename is provided before attempting to open the file

Add a check to ensure the args.filename is not None before attempting to open the file in
the submit command section.

tyora/tyora.py [228-229]

+if not args.filename:
+    print("Filename is required for submission.")
+    sys.exit(1)
 with open(args.filename, "rb") as f:
     submission_code = f.read()
 
Suggestion importance[1-10]: 10

Why: This suggestion is crucial as it prevents a potential runtime error when args.filename is None, which would lead to an unhandled exception when trying to open a file that doesn't exist.

10
Ensure the submit link is present before making the GET request

Add a check to ensure task.submit_link is not None before making the GET request in the
submit_task method.

tyora/client.py [66-67]

+if not task.submit_link:
+    raise ValueError("No submit link found for the task.")
 res = self.session.get(urljoin(self.session.base_url, task.submit_link))
 
Suggestion importance[1-10]: 8

Why: This suggestion is important for robustness. It prevents making a GET request with an invalid URL, which would result in a runtime error if task.submit_link is None.

8
Ensure the login form action URL is present before making the POST request

Add a check to ensure the login form action URL is not None before making the POST request
in the login method.

tyora/session.py [78-79]

+if not action:
+    raise ValueError("Login form action URL not found.")
 self.post(
     url=urljoin(res.url, action),
 
Suggestion importance[1-10]: 8

Why: This suggestion is important to ensure that the POST request in the login method is made to a valid URL. Without this check, the code might attempt to make a request to an invalid URL, leading to errors.

8
Ensure the task ID is a valid numerical identifier before proceeding

Add a check to ensure the task_id is a valid numerical identifier before proceeding in the
show command section.

tyora/tyora.py [224]

+if not args.task_id.isdigit():
+    print("Invalid task ID. It must be a numerical identifier.")
+    sys.exit(1)
 print_task(client.get_task(args.task_id))
 
Suggestion importance[1-10]: 7

Why: This suggestion improves the robustness of the application by ensuring that the task_id used in the show command is a valid numerical identifier, thus preventing potential errors or misbehavior if an invalid task_id is provided.

7
Security
Use environment variables to store sensitive information like usernames and passwords

To ensure that the username and password parameters are not hardcoded, consider using
environment variables or a configuration file to store these sensitive details. This will
enhance security and make the tests more flexible.

tests/test_session.py [23]

-mock_session.login(username="test_user@test.com", password="test_password")
+import os
+...
+mock_session.login(username=os.getenv("TEST_USERNAME"), password=os.getenv("TEST_PASSWORD"))
 
Suggestion importance[1-10]: 9

Why: The suggestion correctly identifies a security improvement by replacing hardcoded credentials with environment variables, which is crucial for maintaining security best practices.

9
Best practice
Replace print statements with logging for better control over debug output

Instead of using print statements for debugging, consider using logging with an
appropriate logging level. This will provide more control over the output and can be
easily turned on or off.

tests/test_session.py [24]

-print(mock_session.get("https://example.com/list").text)
+import logging
+...
+logging.debug(mock_session.get("https://example.com/list").text)
 
Suggestion importance[1-10]: 7

Why: Replacing print statements with logging is a good practice for better control and maintainability of output in production and testing environments.

7
Maintainability
Add a comment to explain the purpose of the if __name__ == "__main__": block

Add a brief comment explaining the purpose of the if name == "main": block, which
will help other developers understand why this block is necessary.

tyora/main.py [3-4]

+# This block ensures that the main function is called only when this script is executed directly.
 if __name__ == "__main__":
     tyora.main()
 
Suggestion importance[1-10]: 5

Why: Adding a comment for clarity is helpful for maintainability, although it's a minor improvement and not crucial to functionality.

5

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Outside diff range and nitpick comments (1)
tyora/session.py (1)

Line range hint 41-77: The login method is well-refactored to handle the login process using direct credentials and HTML form parsing. Consider adding error handling for potential network issues during the login steps.

try:
    # existing login code
except requests.exceptions.RequestException as e:
    logger.error(f"Network error during login: {e}")
    raise
Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between 717a10d and 366d792.
Files selected for processing (8)
  • pyproject.toml (1 hunks)
  • tests/test_session.py (3 hunks)
  • tests/test_tyora.py (1 hunks)
  • tests/test_utils.py (1 hunks)
  • tyora/client.py (1 hunks)
  • tyora/session.py (5 hunks)
  • tyora/tyora.py (1 hunks)
  • tyora/utils.py (1 hunks)
Files skipped from review due to trivial changes (2)
  • pyproject.toml
  • tests/test_tyora.py
Additional comments not posted (15)
tests/test_session.py (3)

23-23: The test for successful login correctly uses the updated login method signature and properly mocks the HTTP responses.


44-44: The test for failed login is well-implemented, correctly anticipating a ValueError when login credentials are incorrect.


44-44: The cookie handling test is straightforward and effectively checks the session's cookie management.

tests/test_utils.py (2)

15-22: The tests in TestFindLink cover both successful and unsuccessful scenarios effectively, ensuring the find_link function behaves as expected under various conditions.


36-46: The tests in TestParseForm are comprehensive, covering cases with and without forms and inputs, ensuring robust behavior of the parse_form function.

tyora/session.py (2)

38-39: The is_logged_in method correctly uses the find_link function to check for a logout link, which is a reliable indicator of session status.


22-22: The constructor of MoocfiCsesSession is correctly set up to initialize the session with the provided base URL and cookies.

tyora/client.py (2)

85-127: The parse_task_list function effectively parses HTML to construct a list of Task objects, ensuring each task is correctly identified and categorized.


130-174: The parse_task function is thorough in extracting detailed information about tasks from HTML, ensuring all relevant data is captured and properly formatted.

tyora/tyora.py (6)

32-89: The parse_args function is well-implemented, providing a comprehensive set of command-line options and handling different commands effectively.


92-100: The create_config function correctly prompts for user credentials and constructs the configuration dictionary.


115-122: The read_config function effectively reads the configuration from a file and checks for the presence of essential settings.


125-140: The read_cookie_file function is robust, handling potential errors such as file not found and JSON decoding issues effectively.


161-171: The print_task_list function is well-implemented, correctly applying filters and limits to the list of tasks displayed.


173-177: The print_task function effectively displays detailed information about a task, including its description and submission file name.

tyora/utils.py Outdated
Comment on lines 6 to 16
def find_link(html: AnyStr, xpath: str) -> dict[str, Optional[str]]:
"""Search for html link by xpath and return dict with href and text"""
anchor_element = html5lib.parse(html, namespaceHTMLElements=False).find(xpath)
if anchor_element is None:
return dict()

link_data = dict()
link_data["href"] = anchor_element.get("href")
link_data["text"] = anchor_element.text

return link_data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of find_link looks good. Consider adding unit tests to cover edge cases, such as malformed HTML or incorrect XPath expressions.

Would you like me to help with writing these tests?

tyora/utils.py Outdated
Comment on lines 19 to 32
def parse_form(html: AnyStr, xpath: str = ".//form") -> dict[str, Optional[str]]:
"""Search for the first form in html and return dict with action and all other found inputs"""
form_element = html5lib.parse(html, namespaceHTMLElements=False).find(xpath)
if form_element is None:
return dict()

form_data = dict()
form_data["_action"] = form_element.get("action")
for form_input in form_element.iter("input"):
form_key = form_input.get("name") or ""
form_value = form_input.get("value") or ""
form_data[form_key] = form_value

return form_data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of parse_form is robust and handles cases where the form or inputs are missing. Consider adding unit tests to ensure it handles various HTML structures and edge cases correctly.

Would you like assistance in creating these tests?

tyora/client.py Outdated
Comment on lines 34 to 83
class Client:
def __init__(self, session: Session) -> None:
self.session = session

def login(self, username: str, password: str) -> None:
self.session.login(username, password)

def get_task_list(self) -> list[Task]:
res = self.session.get(urljoin(self.session.base_url, "list"))
res.raise_for_status()
return parse_task_list(res.text)

def get_task(self, task_id: str) -> Task:
res = self.session.get(urljoin(self.session.base_url, f"task/{task_id}"))
res.raise_for_status()
try:
task = parse_task(res.text)
except ValueError as e:
logger.debug(f"Error parsing task: {e}")
raise
return task

def submit_task(
self, task_id: str, submission: AnyStr, filename: Optional[str]
) -> str:
task = self.get_task(task_id)
if not task.submit_file and not filename:
raise ValueError("No submission filename found")
if not task.submit_link:
raise ValueError("No submit link found")
submit_file = task.submit_file or filename

res = self.session.get(urljoin(self.session.base_url, task.submit_link))
res.raise_for_status()
parsed_form_data = parse_form(res.text)
action = parsed_form_data.pop("_action")

submit_form_data = dict()
for key, value in parsed_form_data.items():
submit_form_data[key] = (None, value)
submit_form_data["file"] = (submit_file, submission)
submit_form_data["lang"] = (None, "Python3")
submit_form_data["option"] = (None, "CPython3")
res = self.session.post(
urljoin(self.session.base_url, action), files=submit_form_data
)
res.raise_for_status()

return res.url

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Client class is well-designed to handle server interactions for tasks. It uses the session effectively for HTTP requests. Consider adding error handling for network issues to improve robustness.

try:
    # existing client code
except requests.exceptions.RequestException as e:
    logger.error(f"Network error during client operations: {e}")
    raise

Comment on lines +103 to +112
def write_config(configfile: str, config: dict[str, str]) -> None:
file_path = Path(configfile).expanduser()
if file_path.exists():
# TODO: https://github.com/madeddie/tyora/issues/28
...
# Ensure directory exists
file_path.parent.mkdir(parents=True, exist_ok=True)
print("Writing config to file")
with open(file_path, "w") as f:
json.dump(config, f)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The write_config function is well-implemented, ensuring the configuration directory exists before writing. Consider adding error handling for file write operations to improve robustness.

try:
    with open(file_path, "w") as f:
        json.dump(config, f)
except IOError as e:
    logger.error(f"Failed to write config to {file_path}: {e}")
    raise

tyora/tyora.py Outdated
Comment on lines 179 to 251
def main() -> None:
args = parse_args()

logging.basicConfig(
level=logging.DEBUG if args.debug else logging.WARNING,
format="%(asctime)s %(levelname)s %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)

if args.cmd == "login":
config = create_config()
write_config(args.config, config)
return

config = read_config(args.config)

# Merge cli args and configfile parameters in one dict
config.update((k, v) for k, v in vars(args).items() if v is not None)

base_url = f"https://cses.fi/{config['course']}/"

cookiefile = None
cookies = dict()
if not args.no_state:
if not STATE_DIR.exists():
STATE_DIR.mkdir(parents=True, exist_ok=True)
cookiefile = STATE_DIR / "cookies.json"
cookies = read_cookie_file(str(cookiefile))

session = Session(
base_url=base_url,
cookies=cookies,
)

client = Client(session)
client.login(username=config["username"], password=config["password"])

if not args.no_state and cookiefile:
cookies = session.cookies.get_dict()
write_cookie_file(str(cookiefile), cookies)

if args.cmd == "list":
print_task_list(client.get_task_list(), filter=args.filter, limit=args.limit)

if args.cmd == "show":
print_task(client.get_task(args.task_id))

if args.cmd == "submit":
# TODO allow user to paste the code in or even pipe it in
with open(args.filename, "rb") as f:
submission_code = f.read()

result_url = client.submit_task(
task_id=args.task_id,
filename=args.filename,
submission=submission_code,
)
print("Waiting for test results.", end="")
while True:
print(".", end="")
res = session.get(result_url)
print(res.text)
res.raise_for_status()
if "Test report" in res.text:
break
sleep(1)

print()
results = parse_submit_result(res.text)

print(f"Submission status: {results['status']}")
print(f"Submission result: {results['result']}")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main function is well-structured, handling different commands based on user input effectively. Consider adding error handling for network issues to ensure the script is robust under various conditions.

try:
    # existing main function code
except requests.exceptions.RequestException as e:
    logger.error(f"Network error during operations: {e}")
    raise

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between 366d792 and 4ab8b04.
Files ignored due to path filters (2)
  • requirements-dev.lock is excluded by !**/*.lock
  • requirements.lock is excluded by !**/*.lock
Files selected for processing (7)
  • pyproject.toml (1 hunks)
  • tests/test_session.py (4 hunks)
  • tests/test_tyora.py (1 hunks)
  • tyora/client.py (1 hunks)
  • tyora/session.py (5 hunks)
  • tyora/tyora.py (1 hunks)
  • tyora/utils.py (1 hunks)
Files skipped from review as they are similar to previous changes (5)
  • pyproject.toml
  • tests/test_session.py
  • tests/test_tyora.py
  • tyora/tyora.py
  • tyora/utils.py
Additional comments not posted (6)
tyora/session.py (2)

23-23: Constructor setup looks good and is well-documented.


40-41: Method to check login status is correctly implemented.

tyora/client.py (4)

39-41: Login method correctly delegates to the session object.


42-45: Method to retrieve task list is implemented correctly.


47-55: Error handling in the task retrieval method is robust.


87-176: Utility functions for parsing tasks are well-implemented.

tyora/session.py Outdated Show resolved Hide resolved
tyora/client.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between 4ab8b04 and 3c16e7a.
Files selected for processing (7)
  • tests/test_client.py (1 hunks)
  • tests/test_data/session_logged_in_some_tasks_done.html (1 hunks)
  • tests/test_data/task_3052_incomplete_no_submit_link.html (1 hunks)
  • tests/test_data/task_3055_complete.html (1 hunks)
  • tyora/client.py (1 hunks)
  • tyora/session.py (5 hunks)
  • tyora/tyora.py (1 hunks)
Files skipped from review due to trivial changes (2)
  • tests/test_data/task_3052_incomplete_no_submit_link.html
  • tests/test_data/task_3055_complete.html
Files skipped from review as they are similar to previous changes (1)
  • tyora/tyora.py
Additional comments not posted (9)
tests/test_data/session_logged_in_some_tasks_done.html (1)

40-40: The updated task list in the HTML content aligns with the changes described in the PR summary.

tyora/session.py (2)

23-23: The updated constructor aligns with the PR's objective to simplify session management by reducing the parameters.


Line range hint 43-80: The refactoring of the login method improves readability and maintainability. The handling of the _action variable is now more explicit, addressing previous concerns.

tests/test_client.py (3)

24-45: The test case for retrieving the task list is well-implemented, with correct assertions that align with the expected task details.


48-65: The test case for retrieving a complete task is thorough, with detailed assertions that correctly validate the task's properties and submission link.


67-85: The test case for an incomplete task without a submit link is correctly implemented, with appropriate assertions for the task's properties.

tyora/client.py (3)

39-42: The get_task_list method is correctly implemented, using the session to retrieve and parse the task list.


44-52: The get_task method is well-implemented, with appropriate error handling and logging for task retrieval.


54-80: The submit_task method is robust, correctly handling the submission process, including checks for necessary submission details and handling form data.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between 3c16e7a and cd21193.
Files selected for processing (4)
  • pyproject.toml (1 hunks)
  • tests/test_client.py (1 hunks)
  • tyora/client.py (1 hunks)
  • tyora/tyora.py (1 hunks)
Files skipped from review as they are similar to previous changes (3)
  • pyproject.toml
  • tests/test_client.py
  • tyora/tyora.py
Additional comments not posted (3)
tyora/client.py (3)

19-22: The TaskState enum is well-defined and clear.


24-33: The Task dataclass is well-structured and appropriately uses optional fields for properties that may not always be present.


59-61: The error messages have been improved to include the task ID, which enhances the clarity and debuggability of the code.

Comment on lines +35 to +81
class Client:
def __init__(self, session: Session) -> None:
self.session = session

def get_task_list(self) -> list[Task]:
res = self.session.get(urljoin(self.session.base_url, "list"))
res.raise_for_status()
return parse_task_list(res.text)

def get_task(self, task_id: str) -> Task:
res = self.session.get(urljoin(self.session.base_url, f"task/{task_id}"))
res.raise_for_status()
try:
task = parse_task(res.text)
except ValueError as e:
logger.debug(f"Error parsing task: {e}")
raise
return task

def submit_task(
self, task_id: str, submission: str, filename: Optional[str]
) -> str:
task = self.get_task(task_id)
if not task.submit_file and not filename:
raise ValueError("No submission filename found for task ID: " + task_id)
if not task.submit_link:
raise ValueError("No submit link found for task ID: " + task_id)
submit_file = task.submit_file or filename

res = self.session.get(urljoin(self.session.base_url, task.submit_link))
res.raise_for_status()
parsed_form_data = parse_form(res.text)
action = parsed_form_data.pop("_action")

submit_form_data: dict[str, tuple[Optional[str], Optional[str]]] = dict()
for key, value in parsed_form_data.items():
submit_form_data[key] = (None, value)
submit_form_data["file"] = (submit_file, submission)
submit_form_data["lang"] = (None, "Python3")
submit_form_data["option"] = (None, "CPython3")
res = self.session.post(
urljoin(self.session.base_url, action),
files=submit_form_data, # type: ignore[arg-type]
)
res.raise_for_status()
return res.url

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Client class methods are well-organized and handle server interactions effectively. However, consider adding error handling for network issues to improve robustness, as previously suggested.

Would you like me to add the error handling code for network issues as previously suggested?

Comment on lines +83 to +125
def parse_task_list(html: AnyStr) -> list[Task]:
"""Parse html to find tasks and their status, returns list of Task objects"""
root = html5lib.parse(html, namespaceHTMLElements=False) # type: ignore[reportUnknownMemberType]
task_element_list = root.findall('.//li[@class="task"]')

task_list: list[Task] = list()
for task_element in task_element_list:
task_id = None
task_name = None
task_state = None

task_link = task_element.find("a")
if task_link is None:
continue

task_name = task_link.text
task_id = task_link.get("href", "/").split("/")[-1]
if not task_name or not task_id:
continue

task_element_spans = task_element.findall("span")
if not task_element_spans:
continue

task_element_span = next(
(span for span in task_element_spans if span.get("class", "") != "detail"),
Element("span"),
)
task_element_class = task_element_span.get("class") or ""

task_state = (
TaskState.COMPLETE if "full" in task_element_class else TaskState.INCOMPLETE
)

task = Task(
id=task_id,
name=task_name,
state=task_state,
)
task_list.append(task)

return task_list

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parse_task_list function is comprehensive and handles various edge cases well. However, consider adding error handling for unexpected HTML structures to prevent runtime errors.

Would you like me to add error handling for unexpected HTML structures in the parse_task_list function?

Comment on lines +127 to +171
def parse_task(html: AnyStr) -> Task:
root = html5lib.parse(html, namespaceHTMLElements=False) # type: ignore[reportUnknownMemberType]
task_link_element = root.find('.//div[@class="nav sidebar"]/a[@class="current"]')
task_link = task_link_element if task_link_element is not None else Element("a")
task_id = task_link.get("href", "").split("/")[-1]
if not task_id:
raise ValueError("Failed to find task id")
task_name = task_link.text or None
if not task_name:
raise ValueError("Failed to find task name")
task_span_element = task_link.find("span")
task_span = task_span_element if task_span_element is not None else Element("span")
task_span_class = task_span.get("class", "")
desc_div_element = root.find('.//div[@class="md"]')
desc_div = desc_div_element if desc_div_element is not None else Element("div")
description = html2text(tostring(desc_div).decode("utf8"))
code = root.findtext(".//pre", None)
submit_link_element = root.find('.//a[.="Submit"]')
submit_link = (
submit_link_element.get("href", None)
if submit_link_element is not None
else None
)

submit_file = next(
iter(
[
code_element.text
for code_element in root.findall(".//code")
if code_element.text is not None and ".py" in code_element.text
]
),
None,
)
task = Task(
id=task_id,
name=task_name,
state=TaskState.COMPLETE if "full" in task_span_class else TaskState.INCOMPLETE,
description=description.strip(),
code=code,
submit_file=submit_file,
submit_link=submit_link,
)

return task
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parse_task function is detailed and robust, extracting various task properties from HTML. However, the use of html5lib without proper error handling could lead to issues if the HTML structure changes.

Would you like me to add error handling for unexpected HTML structures in the parse_task function?

Comment on lines +174 to +192
# TODO test with failing results
# Seems to be broken since the switch to html5lib, needs tests!
def parse_submit_result(html: AnyStr) -> dict[str, str]:
root = html5lib.parse(html, namespaceHTMLElements=False) # type: ignore[reportUnknownMemberType]
submit_status_element = root.find('.//td[.="Status:"]/..') or Element("td")
submit_status_span_element = submit_status_element.find("td/span") or Element(
"span"
)
submit_status = submit_status_span_element.text or ""
submit_result_element = root.find('.//td[.="Result:"]/..') or Element("td")
submit_result_span_element = submit_result_element.find("td/span") or Element(
"span"
)
submit_result = submit_result_span_element.text or ""

return {
"status": submit_status.lower(),
"result": submit_result.lower(),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parse_submit_result function appears to be a work in progress, as indicated by the TODO comment. It's crucial to add tests to ensure its functionality, especially since the switch to html5lib.

Would you like me to help with writing tests for the parse_submit_result function?

The flow is not completed in the test, to be revisited
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits Files that changed from the base of the PR and between cd21193 and 659245b.
Files selected for processing (4)
  • tests/test_client.py (1 hunks)
  • tests/test_data/submit_3055_form.html (1 hunks)
  • tests/test_data/task_3052_incomplete_no_submit_link.html (1 hunks)
  • tests/test_data/task_3055_complete.html (1 hunks)
Files skipped from review due to trivial changes (1)
  • tests/test_data/submit_3055_form.html
Files skipped from review as they are similar to previous changes (3)
  • tests/test_client.py
  • tests/test_data/task_3052_incomplete_no_submit_link.html
  • tests/test_data/task_3055_complete.html

@madeddie
Copy link
Owner Author

Merging this work now, adding a new issue to fix the parse_submit_results and its tests.

@madeddie madeddie merged commit 87451bf into main May 28, 2024
5 checks passed
@madeddie madeddie deleted the 54-refactor-split-out-client branch May 28, 2024 14:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor: split out client
1 participant