# Computing the H-Index
- Given an array of positive integers,
- find the largest `h` such that at least `h` entries in the array are greater than or equal to `h`
---
#### From Introduction* Pg. 1
- h-index = a metric of a researcher's productivity and citation impact
    - the largest number `h` such that the researcher has:
        - published at least `h` papers
        - each of which has been cited at least `h` times
- input? -> array of positive integers representing the citation counts of each of the author's papers
    - may or may not be sorted (we are allowed to modify it)
--- 
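
As a reference point, the definition can be checked directly, with no sorting. This is only a minimal sketch for cross-checking the faster versions; the name `h_index_bruteforce` is made up here and the loop is `O(n^2)`.

```python
from typing import List

def h_index_bruteforce(citations: List[int]) -> int:
    """Largest h such that at least h entries are >= h (straight from the definition)."""
    n = len(citations)
    # h can never exceed the number of papers, so try n, n-1, ..., 1
    for h in range(n, 0, -1):
        # count the papers with at least h citations
        if sum(c >= h for c in citations) >= h:
            return h
    return 0

print(h_index_bruteforce([9, 6, 5, 4, 2]))  # 4
```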

In [1]:
from typing import List

In [2]:
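def h_index_sorted(citations: List[int]) -> int:
    # sort ascending, in place (the caller's list is modified)
    citations.sort()
    n = len(citations)
    # scan left to right; for cite = [9,6,5,4,2] this stops at c = 4
    for i, c in enumerate(citations):
        # n - i = number of papers with at least c citations
        #         (this paper and everything to its right)
        # the first c that reaches that count gives the h-index
        if c >= n - i:
            return n - i
    return 0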

In [3]:
cite = [9,6,5,4,2]
h_index_sorted(cite)

4

#### `O(n log n)` time complexity (dominated by the sort)
#### `O(1)` extra space beyond whatever `list.sort()` itself uses

---
### Input Array Sorted
---
- fastest, according to this [Towards Data Science article](https://towardsdatascience.com/fastest-way-to-calculate-h-index-of-publications-6fd52e381fee)

In [4]:
import numpy as np

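def fastest(citations):
    citations = np.array(citations)
    n = citations.shape[0]
    # 1-based rank of each paper: 1, 2, ..., n
    array = np.arange(1, n + 1)

    # sort descending so the paper at rank r has the r-th largest count
    citations = np.sort(citations)[::-1]

    # at rank r the candidate h is min(citations, rank); the h-index is the largest
    h_idx = np.max(np.minimum(citations, array))

    return h_idx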

In [5]:
cited = [5,4,1,3,2,6,3,4,1]

fastest(cited)

4
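
The same max-of-minimums idea can be written without NumPy, which makes the rank/citation pairing easier to see. A minimal sketch, assuming non-negative integer counts; the name `h_index_max_min` is hypothetical and it returns 0 for an empty list.

```python
from typing import List

def h_index_max_min(citations: List[int]) -> int:
    # sorted() returns a new descending list, so the input is left untouched
    ranked = sorted(citations, reverse=True)
    # pair each count c with its 1-based rank r; the candidate h at that rank is min(c, r)
    return max((min(c, r) for r, c in enumerate(ranked, start=1)), default=0)

print(h_index_max_min([5, 4, 1, 3, 2, 6, 3, 4, 1]))  # 4
```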

---
##### Counting with `sum()` (input must already be sorted in descending order)

In [6]:
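def reverse_sort_sum(citations: List[int]) -> int:
    # expects citations sorted in descending order
    # the paper at 0-based index i counts toward h when its citations c exceed i
    return sum(i < c for i, c in enumerate(citations))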

In [7]:
cited = [5,4,1,3,2,6,3,4,1]
cited.sort(reverse=True)

reverse_sort_sum(cited)

4

In [8]:
import numpy as np

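def numpy(citations):
    # expects citations sorted in descending order
    # array holds the 1-based rank of each paper
    array = np.arange(1, len(citations) + 1)
    # count the papers whose citation count c is at least their rank p
    return sum(p <= c for (c, p) in zip(citations, array))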

In [9]:
cited = [5,4,1,3,2,6,3,4,1]
cited.sort(reverse=True)

numpy(cited)

4

---
### As much additional memory as you want
---

In [10]:
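def memoryPlus(cited: List[int]) -> int:
    # sort descending, in place
    cited.sort(reverse=True)
    # h collects the papers that count toward the h-index
    h = []
    for c in cited:
        # the next paper would be number len(h) + 1, so it needs more than len(h) citations
        # strict `>`: e.g. [1, 1] must give 1, not 2
        if c > len(h):
            h.append(c)
    return len(h)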

In [11]:
cited = [5,4,1,3,2,6,3,4,1]
memoryPlus(cited)

4
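
With extra memory available, one common alternative is to skip the sort and count papers per citation value, which runs in `O(n)` time with `O(n)` extra space. The sketch below is that swapped-in technique, not the approach above; `h_index_counting` is a hypothetical name and the code assumes non-negative integer counts.

```python
from typing import List

def h_index_counting(citations: List[int]) -> int:
    n = len(citations)
    # buckets[k] = number of papers with exactly k citations;
    # counts above n are capped at n, since h can never exceed n
    buckets = [0] * (n + 1)
    for c in citations:
        buckets[min(c, n)] += 1

    papers = 0  # running count of papers with at least h citations
    for h in range(n, 0, -1):
        papers += buckets[h]
        if papers >= h:
            return h
    return 0

print(h_index_counting([5, 4, 1, 3, 2, 6, 3, 4, 1]))  # 4
```

Unlike the sorting versions, this leaves the input list unmodified.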