@misrasaurabh1
Contributor

@misrasaurabh1 misrasaurabh1 commented Nov 10, 2025

PR Type

Enhancement, Tests


Description

  • Add common tags computation utility

  • Introduce unit tests for core logic

  • Handle empty input safely


Diagram Walkthrough

flowchart LR
  A["common_tags.py: find_common_tags"] -- "iterates over articles" --> B["compute intersection of tags"]
  B -- "returns" --> C["set of common tags"]
  D["tests/test_common_tags.py"] -- "validates" --> A

File Walkthrough

Relevant files
Enhancement
common_tags.py
Implement common tags computation helper                                 

codeflash/result/common_tags.py

  • Introduce find_common_tags function.
  • Handle empty input by returning empty set.
  • Compute common tags across articles via intersection logic.
+11/-0   
Tests
test_common_tags.py
Add unit tests for common tags utility                                     

tests/test_common_tags.py

  • Add tests covering typical cases.
  • Verify consistent result across multiple inputs.
+22/-0   

Signed-off-by: Saurabh Misra <misra.saurabh1@gmail.com>
@github-actions

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Performance

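The intersection is recomputed with a list comprehension for every article, so each pass scans that article's tag list once per candidate tag. With n articles, up to m candidate tags, and up to k tags per article, this is O(n·m·k) membership work. Converting to set intersection would be more efficient and clearer, especially for larger tag lists.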

common_tags = articles[0].get("tags", [])
for article in articles[1:]:
    common_tags = [tag for tag in common_tags if tag in article.get("tags", [])]
return set(common_tags)
Duplicates Handling

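Because common_tags starts as a list and is filtered as a list, duplicate tags in the first article survive every pass and are only collapsed by the final set() call. The returned value is still a set, but duplicates are rescanned on each iteration and the set semantics are implicit; consider normalizing to a set at the start.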

common_tags = articles[0].get("tags", [])
for article in articles[1:]:
    common_tags = [tag for tag in common_tags if tag in article.get("tags", [])]
return set(common_tags)
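
To make the duplicate behavior concrete, here is a minimal sketch. The function signature and the empty-input guard are assumptions reconstructed from the quoted lines, and `intermediate_tags` is a hypothetical helper that exists only to expose the list before `set()` is applied.

```python
from typing import Any


def find_common_tags_list_based(articles: list[dict[str, Any]]) -> set[str]:
    """Reconstruction of the reviewed loop; signature and guard are assumed."""
    if not articles:
        return set()
    common_tags = articles[0].get("tags", [])
    for article in articles[1:]:
        common_tags = [tag for tag in common_tags if tag in article.get("tags", [])]
    return set(common_tags)


def intermediate_tags(articles: list[dict[str, Any]]) -> list[str]:
    """Hypothetical helper: same loop, but returns the list before set()."""
    common_tags = articles[0].get("tags", [])
    for article in articles[1:]:
        common_tags = [tag for tag in common_tags if tag in article.get("tags", [])]
    return common_tags


articles = [
    {"title": "Article 1", "tags": ["Python", "Python", "AI"]},
    {"title": "Article 2", "tags": ["Python", "AI"]},
]

# The intermediate list still carries the repeated "Python" entry.
duplicated = intermediate_tags(articles)

# The final set() call collapses the duplicate in the returned value.
result = find_common_tags_list_based(articles)
```

In this sketch the duplicate costs an extra membership scan per article but does not leak into the result.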
Missing Edge Tests

Add tests for empty input, missing tags key, empty tag lists, and duplicate tags to validate the stated "Handle empty input safely" and set semantics.

def test_common_tags_1() -> None:
    articles_1 = [
        {"title": "Article 1", "tags": ["Python", "AI", "ML"]},
        {"title": "Article 2", "tags": ["Python", "Data Science", "AI"]},
        {"title": "Article 3", "tags": ["Python", "AI", "Big Data"]},
    ]

    expected = {"Python", "AI"}

    assert find_common_tags(articles_1) == expected

    articles_2 = [
        {"title": "Article 1", "tags": ["Python", "AI", "ML"]},
        {"title": "Article 2", "tags": ["Python", "Data Science", "AI"]},
        {"title": "Article 3", "tags": ["Python", "AI", "Big Data"]},
        {"title": "Article 4", "tags": ["Python", "AI", "ML"]},
    ]

    assert find_common_tags(articles_2) == expected
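
The missing edge cases could be covered along the following lines. This is a sketch, not the PR's test file: the local `find_common_tags` is a stand-in reconstructed from the quoted lines so the block runs on its own (the real tests would import the function instead), and the test names are hypothetical.

```python
from typing import Any


def find_common_tags(articles: list[dict[str, Any]]) -> set[str]:
    # Stand-in reconstructed from the reviewed lines (assumed signature and guard).
    if not articles:
        return set()
    common_tags = articles[0].get("tags", [])
    for article in articles[1:]:
        common_tags = [tag for tag in common_tags if tag in article.get("tags", [])]
    return set(common_tags)


def test_empty_input() -> None:
    # No articles at all: nothing can be common.
    assert find_common_tags([]) == set()


def test_missing_tags_key() -> None:
    # An article without a "tags" key is treated as having no tags.
    articles = [{"title": "A", "tags": ["Python"]}, {"title": "B"}]
    assert find_common_tags(articles) == set()


def test_empty_tag_list() -> None:
    # One empty tag list empties the intersection.
    articles = [{"title": "A", "tags": []}, {"title": "B", "tags": ["Python"]}]
    assert find_common_tags(articles) == set()


def test_duplicate_tags() -> None:
    # Repeated tags must not change the resulting set.
    articles = [
        {"title": "A", "tags": ["Python", "Python", "AI"]},
        {"title": "B", "tags": ["Python", "AI", "AI"]},
    ]
    assert find_common_tags(articles) == {"Python", "AI"}


def test_single_article() -> None:
    # With one article, every one of its tags is common.
    assert find_common_tags([{"title": "A", "tags": ["Python", "AI"]}]) == {"Python", "AI"}
```

Each case pins down one behavior separately, so a regression points at a specific edge case rather than at a combined test.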

@github-actions

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
General
Use set intersections

Convert tag lists to sets and use set intersection for clarity and performance.
This avoids a linear membership scan per candidate tag. Tag comparison stays
case-sensitive, as in the original; normalize case only if that is intended.

codeflash/result/common_tags.py [8-11]

-common_tags = articles[0].get("tags", [])
+common_tags = set(articles[0].get("tags", []))
 for article in articles[1:]:
-    common_tags = [tag for tag in common_tags if tag in article.get("tags", [])]
-return set(common_tags)
+    common_tags &= set(article.get("tags", []))
+return common_tags
Suggestion importance[1-10]: 7

Why: Converting lists to sets and using intersection simplifies the logic and improves performance while preserving behavior; the suggested code correctly reflects the intended change for the specified lines.

Medium
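
For reference, the suggested change can be read as a complete function roughly as follows. This is a sketch under assumptions: the signature, type hints, and docstring are not shown in the diff and are filled in here for illustration; only the loop body comes from the suggestion.

```python
from typing import Any


def find_common_tags(articles: list[dict[str, Any]]) -> set[str]:
    """Return the tags shared by every article (assumed signature)."""
    if not articles:
        return set()
    # Start from a set so duplicates in the first article are dropped up front.
    common_tags = set(articles[0].get("tags", []))
    for article in articles[1:]:
        # In-place intersection; membership checks on a set are O(1) on average.
        common_tags &= set(article.get("tags", []))
    return common_tags


# Input taken from the PR's own test.
articles_1 = [
    {"title": "Article 1", "tags": ["Python", "AI", "ML"]},
    {"title": "Article 2", "tags": ["Python", "Data Science", "AI"]},
    {"title": "Article 3", "tags": ["Python", "AI", "Big Data"]},
]
common = find_common_tags(articles_1)
```

With this shape, each article's tags are visited once to build the set, instead of being rescanned for every candidate tag.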
Possible issue
Validate tags shape

Guard against missing or non-list tags values to prevent runtime errors. Normalize
by treating absent or invalid tags as an empty set.

codeflash/result/common_tags.py [5-11]

 if not articles:
     return set()
+# Normalize first article tags
+first_tags = articles[0].get("tags", [])
+common_tags = set(first_tags if isinstance(first_tags, list) else [])
+for article in articles[1:]:
+    tags = article.get("tags", [])
+    common_tags &= set(tags if isinstance(tags, list) else [])
+return common_tags
Suggestion importance[1-10]: 6

Why: Adding type checks for tags can prevent runtime issues if inputs deviate from expected lists; while reasonable and accurate, it handles an edge case not shown in the tests and is a modest robustness improvement.

Low
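
A compact way to express the validation suggestion is to factor the normalization into one helper. This is a sketch only: `find_common_tags_validated` and `as_tag_set` are hypothetical names, and the signature is assumed.

```python
from typing import Any


def as_tag_set(article: dict[str, Any]) -> set[str]:
    """Hypothetical helper: return the article's tags as a set, or an empty
    set when the "tags" value is missing or is not a list."""
    tags = article.get("tags", [])
    return set(tags) if isinstance(tags, list) else set()


def find_common_tags_validated(articles: list[dict[str, Any]]) -> set[str]:
    """Set-intersection variant that tolerates malformed "tags" values."""
    if not articles:
        return set()
    common_tags = as_tag_set(articles[0])
    for article in articles[1:]:
        common_tags &= as_tag_set(article)
    return common_tags


# A None value would raise TypeError in the list-based loop; here it counts as no tags.
with_none = find_common_tags_validated(
    [{"title": "A", "tags": ["Python"]}, {"title": "B", "tags": None}]
)

# A bare string is not split into characters; it is treated as invalid.
with_string = find_common_tags_validated(
    [{"title": "A", "tags": "Python"}, {"title": "B", "tags": ["Python"]}]
)
```

One design point to weigh: the `isinstance(tags, list)` check also discards tuples and sets of tags, so a caller passing those would silently get an empty result. Whether to accept other iterables is a judgment call the suggestion leaves open.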

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
@codeflash-ai
Contributor

codeflash-ai bot commented Nov 10, 2025

This PR is now faster! 🚀 Saurabh Misra accepted my code suggestion above.
