## ***1148. Article Views I***

import pandas as pd

def article_views(views: pd.DataFrame) -> pd.DataFrame:
    # Filter rows where author_id == viewer_id (author viewed their own article)
    filtered = views[views['author_id'] == views['viewer_id']]
    # Select unique author_ids and rename to 'id'
    result = filtered[['author_id']].drop_duplicates().rename(columns={'author_id': 'id'})
    # Sort by 'id' in ascending order
    result = result.sort_values('id').reset_index(drop=True)
    return result


***Time and Space Complexity***

**Time Complexity**

Let n be the number of rows in the input DataFrame views.

1. Filtering (views['author_id'] == views['viewer_id']):

    - This is a vectorized comparison across n rows.

    - O(n)

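2. Selecting and Renaming Columns:

    - Selecting the author_id column from the filtered rows is at most O(n). The rename runs after drop_duplicates, so it touches only the deduplicated rows.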

3. Dropping Duplicates (drop_duplicates):

    - Every filtered row (at most n) is hashed once, whether or not it is a duplicate.

    - O(n) on average (pandas uses hash tables for this operation)

4. Sorting (sort_values):

    - Sorting k unique author_ids, where k ≤ n.

    - O(k log k)

5. Resetting Index:

    - Linear in number of rows, O(k).

**Total Time Complexity:**

O(n + k log k)

Since k ≤ n, this is O(n log n) in the worst case, which occurs when every row is a self-view by a different author (k = n).
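To make n and k concrete, the following minimal sketch traces the row count after each step. The sample frame and the intermediate names (`mask`, `filtered`, `unique_ids`) are assumptions for illustration only; the columns match the problem's table, with view_date left out because no step reads it.

```python
import pandas as pd

# Hypothetical sample data with the same columns the solution reads.
views = pd.DataFrame({
    'article_id': [1, 1, 2, 2, 3, 3, 3],
    'author_id':  [3, 3, 7, 7, 4, 4, 4],
    'viewer_id':  [5, 6, 7, 6, 1, 4, 4],
})

n = len(views)                                         # rows scanned by the comparison
mask = views['author_id'] == views['viewer_id']        # one boolean per row, O(n)
filtered = views[mask]                                 # rows where an author viewed their own article
unique_ids = filtered['author_id'].drop_duplicates()   # hash-based de-duplication
k = len(unique_ids)                                    # only these k values reach the sort

print(n, len(filtered), k)  # prints: 7 3 2
```

Here the sort handles only 2 values even though 7 rows were scanned, which is why the O(n) filtering step usually dominates in practice.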

**Space Complexity**

1. Filtering:

    - Creates a boolean mask of length n (O(n)), and a filtered DataFrame (up to n rows).

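2. Dropping Duplicates:

    - Stores the k unique author_ids, where k ≤ n (O(k)).

3. Sorting and Resetting Index:

    - Each step returns a new DataFrame holding the same k ids (O(k)).

**Total Space Complexity:**

O(n)

The boolean mask and the filtered DataFrame dominate, since k ≤ n.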
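As a rough illustration of where the memory goes, here is a sketch of a variant that works on a single column instead of intermediate DataFrames. The function name `article_views_compact` is hypothetical, and the sketch assumes plain integer id columns. It still allocates the length-n boolean mask, so the asymptotic space remains O(n).

```python
import numpy as np
import pandas as pd

def article_views_compact(views: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical variant: same result, built from one column only.
    mask = views['author_id'] == views['viewer_id']   # boolean mask, length n
    ids = views.loc[mask, 'author_id'].unique()       # array of the k distinct ids
    return pd.DataFrame({'id': np.sort(ids)})         # sorted copy, length k
```

`Series.unique()` returns the distinct values in order of appearance, so the explicit `np.sort` is still needed to get ascending ids.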


***Example Usage***

import pandas as pd

data = {
    'article_id': [1, 1, 2, 2, 3, 3, 3],
    'author_id':  [3, 3, 7, 7, 4, 4, 4],
    'viewer_id':  [5, 6, 7, 6, 1, 4, 4],
    'view_date': [
        '2019-08-01', '2019-08-02', '2019-08-01', '2019-08-02',
        '2019-07-22', '2019-07-21', '2019-07-21'
    ]
}

views = pd.DataFrame(data)

def article_views(views: pd.DataFrame) -> pd.DataFrame:
    result = views[views['author_id'] == views['viewer_id']]
    result = result[['author_id']].drop_duplicates().rename(columns={'author_id': 'id'})
    result = result.sort_values('id').reset_index(drop=True)
    return result

output = article_views(views)
print(output)


   id
0   4
1   7
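Rather than checking the printed output by eye, the result can be compared against an expected frame. The sketch below repeats the solution so it runs on its own and uses `pandas.testing.assert_frame_equal`. The `no_self_views` frame is a hypothetical edge case not taken from the problem statement, and view_date is omitted because the function never reads it.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

def article_views(views: pd.DataFrame) -> pd.DataFrame:
    # Same logic as the solution above, repeated so this block is self-contained.
    own = views[views['author_id'] == views['viewer_id']]
    ids = own[['author_id']].drop_duplicates().rename(columns={'author_id': 'id'})
    return ids.sort_values('id').reset_index(drop=True)

views = pd.DataFrame({
    'article_id': [1, 1, 2, 2, 3, 3, 3],
    'author_id':  [3, 3, 7, 7, 4, 4, 4],
    'viewer_id':  [5, 6, 7, 6, 1, 4, 4],
})

# Expected frame taken from the printed output: ids 4 and 7 in ascending order.
expected = pd.DataFrame({'id': [4, 7]})
assert_frame_equal(article_views(views), expected)

# Hypothetical edge case: no author viewed their own article.
no_self_views = pd.DataFrame({'article_id': [1], 'author_id': [3], 'viewer_id': [5]})
empty = article_views(no_self_views)
```

In the edge case the function returns a frame with the single column id and no rows, so callers do not need a special branch for "no matches".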
