Skip to content

Conversation

@finitearth
Copy link
Collaborator

@finitearth finitearth commented Jul 21, 2025

  • added mypy type checking
  • fixed tests
  • standardized to use list of floats and list of str (as we are not using matrices any longer)

@finitearth finitearth changed the base branch from feature/RewardTask to main July 21, 2025 14:46
@finitearth finitearth marked this pull request as ready for review July 21, 2025 14:46
@finitearth finitearth requested a review from mo374z as a code owner July 21, 2025 14:46
@finitearth finitearth changed the base branch from main to feature/RewardTask July 21, 2025 14:46
@finitearth finitearth requested a review from Copilot July 21, 2025 17:40
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds mypy type checking to the pre-commit hooks and significantly improves typing across the codebase, while standardizing on the use of Python lists instead of numpy arrays for better type consistency.

  • Added mypy configuration and pre-commit integration with proper exclusions
  • Standardized data structures to use List[str] and List[float] instead of numpy arrays
  • Enhanced type annotations across all modules with proper imports and overloads

Reviewed Changes

Copilot reviewed 42 out of 43 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
pyproject.toml Added mypy dependency and configuration settings
.pre-commit-config.yaml Integrated mypy into pre-commit hooks
promptolution/tasks/base_task.py Major refactor with type annotations and list-based data structures
promptolution/predictors/base_predictor.py Updated predictor interface with proper typing
promptolution/llms/base_llm.py Added comprehensive type annotations for LLM interface
tests/mocks/mock_task.py Updated mock implementation to match new typing patterns
Various test files Updated test assertions to work with list-based returns
Comments suppressed due to low confidence (1)

if shape is not None:
results_array = results_array.reshape(shape)

return results_array
Copy link

Copilot AI Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method returns a numpy array but the type annotation indicates it should return List[str]. This creates a type inconsistency.

Suggested change
return results_array
return results_array.tolist()

Copilot uses AI. Check for mistakes.
Comment on lines +320 to +321
candidates = list(compress(candidates, survivor_mask))
block_scores = list(compress(block_scores, survivor_mask))
Copy link

Copilot AI Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using compress on block_scores (List[List[float]]) with a boolean mask will not work as expected. This should filter the list elements by index, not use compress which expects an iterable.

Suggested change
candidates = list(compress(candidates, survivor_mask))
block_scores = list(compress(block_scores, survivor_mask))
candidates = [candidates[idx] for idx, mask in enumerate(survivor_mask) if mask]
block_scores = [block_scores[idx] for idx, mask in enumerate(survivor_mask) if mask]

Copilot uses AI. Check for mistakes.
Comment on lines +314 to +315
self.xs = [x for i, x in enumerate(self.xs) if i not in indices]
self.ys = [y for i, y in enumerate(self.ys) if i not in indices]
Copy link

Copilot AI Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using 'i not in indices' in a list comprehension has O(n*m) complexity where m is the size of indices. For better performance, convert indices to a set first: 'indices_set = set(indices)'.

Suggested change
self.xs = [x for i, x in enumerate(self.xs) if i not in indices]
self.ys = [y for i, y in enumerate(self.ys) if i not in indices]
indices_set = set(indices)
self.xs = [x for i, x in enumerate(self.xs) if i not in indices_set]
self.ys = [y for i, y in enumerate(self.ys) if i not in indices_set]

Copilot uses AI. Check for mistakes.
xs: List[str] = []
ys: List[str] = []
for label, num_samples in zip(unique_labels, samples_per_class):
indices = np.where(task.ys == label)[0]
Copy link

Copilot AI Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using numpy operations on task.ys which is now a List[str]. This will fail since numpy comparison doesn't work the same way with Python lists.

Suggested change
indices = np.where(task.ys == label)[0]
indices = np.where(np.array(task.ys) == label)[0]

Copilot uses AI. Check for mistakes.
) -> Union[np.ndarray, Tuple[np.ndarray, Union[List[Any], np.ndarray]]]:
) -> Union[List[float], List[List[float]], Tuple[List[List[float]], List[List[str]]]]:
"""Collects all results for the current batch from the cache and formats them."""
assert not (return_agg_scores and return_seq), "Cannot return both aggregated scores and sequences"
Copy link

Copilot AI Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion is duplicated at line 253. The validation should be done once at the beginning of the method rather than twice.

Suggested change
assert not (return_agg_scores and return_seq), "Cannot return both aggregated scores and sequences"
# Removed duplicate assertion as it is already validated at the beginning of the method.

Copilot uses AI. Check for mistakes.
@finitearth finitearth requested a review from timo282 July 22, 2025 13:58
@timo282 timo282 mentioned this pull request Aug 31, 2025
@finitearth finitearth merged commit 51e574f into feature/RewardTask Sep 2, 2025
@finitearth finitearth deleted the chore/add_mypy branch September 2, 2025 21:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants