**Query:** What are the most popular Python repositories that have been created in the last year?

In [2]:
import requests

# Define headers.
headers = {
    "Accept": "application/vnd.github+json",
    "X-GitHub-Api-Version": "2022-11-28",
}

In [3]:
def run_query(url):
    """Run a query, and return list of repo dicts."""
    print(f"Query URL: {url}")
    r = requests.get(url, headers=headers, timeout=10)
    print(f"Status code: {r.status_code}")

    # Convert the response object to a dictionary.
    response_dict = r.json()

    # Show basic information about the query results.
    print(f"Total repositories: {response_dict['total_count']}")
    
    complete_results = not response_dict['incomplete_results']
    print(f"Complete results: {complete_results}")
    
    # Pull the dictionaries for each repository returned.
    repo_dicts = response_dict['items']
    print(f"Repositories returned: {len(repo_dicts)}")
    
    return repo_dicts
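
`run_query()` indexes `total_count` and `items` directly, so it raises a `KeyError` if the API returns an error payload instead of search results, for example when the search rate limit is hit. Below is a minimal sketch of a guard. `extract_repos` is a hypothetical helper, not part of the notebook, and it assumes error responses describe the problem under a `message` key, as GitHub's REST errors typically do.

```python
def extract_repos(status_code, response_dict):
    """Return the list of repo dicts, or raise if the query failed.

    Hypothetical helper: takes the status code and the parsed JSON body
    separately, so it can be exercised without a network call.
    """
    if status_code != 200:
        # Error payloads usually describe the problem under 'message'.
        message = response_dict.get('message', 'no message provided')
        raise RuntimeError(f"Query failed ({status_code}): {message}")

    if response_dict.get('incomplete_results'):
        # The search timed out, so only some matches were returned.
        print("Warning: results are incomplete.")

    return response_dict.get('items', [])
```

Inside `run_query()`, this could be called as `extract_repos(r.status_code, r.json())` in place of the direct dictionary lookups.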

In [4]:
def summarize_repos(repos):
    """Summarize a set of repositories."""
    for repo in repos:
        name = repo['name']
        stars = repo['stargazers_count']
        owner = repo['owner']['login']
        description = repo['description']
        link = repo['html_url']
        
        print(f"\nRepository: {name} ({stars})")
        print(f"  Owner: {owner}")
        print(f"  Description: {description}")
        print(f"  Repository: {link}")

In [5]:
url = "https://api.github.com/search/repositories"
url += "?q=language:python+stars:>1000"
url += "+created:2022-06-01..2023-06-01"
url += "&sort=stars&order=desc"

repo_dicts = run_query(url)
summarize_repos(repo_dicts)

Query URL: https://api.github.com/search/repositories?q=language:python+stars:>1000+created:2022-06-01..2023-06-01&sort=stars&order=desc
Status code: 200
Total repositories: 386
Complete results: True
Repositories returned: 30

Repository: Auto-GPT (138907)
  Owner: Significant-Gravitas
  Description: An experimental open-source attempt to make GPT-4 fully autonomous.
  Repository: https://github.com/Significant-Gravitas/Auto-GPT

Repository: stable-diffusion-webui (82644)
  Owner: AUTOMATIC1111
  Description: Stable Diffusion web UI
  Repository: https://github.com/AUTOMATIC1111/stable-diffusion-webui

Repository: langchain (46647)
  Owner: hwchase17
  Description: ⚡ Building applications with LLMs through composability ⚡
  Repository: https://github.com/hwchase17/langchain

Repository: gpt4free (39863)
  Owner: xtekky
  Description: decentralising the Ai Industry, just some language model api's...
  Repository: https://github.com/xtekky/gpt4free

Repository: whisper (38820)
  Owner: 
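
The query URL above is assembled by hand with `+=`. As an alternative, here is a sketch that builds the same URL from a list of qualifiers using the standard library. `build_search_url` is a hypothetical helper, not part of the notebook.

```python
from urllib.parse import urlencode


def build_search_url(qualifiers, sort="stars", order="desc"):
    """Build a repository search URL from a list of query qualifiers."""
    base = "https://api.github.com/search/repositories"
    params = {
        # Qualifiers are space-separated; urlencode() turns spaces into '+'.
        "q": " ".join(qualifiers),
        "sort": sort,
        "order": order,
    }
    # Leave ':' and '>' unescaped so the URL matches the hand-built one.
    return f"{base}?{urlencode(params, safe=':>')}"


url = build_search_url([
    "language:python",
    "stars:>1000",
    "created:2022-06-01..2023-06-01",
])
print(url)
```

Keeping the qualifiers in a list makes it easier to add or remove terms, such as a series of `NOT` keywords, without editing a long string.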

**Query:** What are the most popular Python repositories created in the last year that aren't focused on AI?

In [6]:
url = "https://api.github.com/search/repositories"
url += "?q=language:python+stars:>1000"
url += "+NOT+gpt+NOT+llama+NOT+chat+NOT+llm+NOT+diffusion"
url += "+created:2022-06-01..2023-06-01"
url += "&sort=stars&order=desc"

repo_dicts = run_query(url)
summarize_repos(repo_dicts)

Query URL: https://api.github.com/search/repositories?q=language:python+stars:>1000+NOT+gpt+NOT+llama+NOT+chat+NOT+llm+NOT+diffusion+created:2022-06-01..2023-06-01&sort=stars&order=desc
Status code: 200
Total repositories: 203
Complete results: True
Repositories returned: 30

Repository: whisper (38820)
  Owner: openai
  Description: Robust Speech Recognition via Large-Scale Weak Supervision
  Repository: https://github.com/openai/whisper

Repository: TaskMatrix (33241)
  Owner: microsoft
  Description: None
  Repository: https://github.com/microsoft/TaskMatrix

Repository: stanford_alpaca (24925)
  Owner: tatsu-lab
  Description: Code and documentation to train Stanford's Alpaca models, and generate the data.
  Repository: https://github.com/tatsu-lab/stanford_alpaca

Repository: babyagi (15280)
  Owner: yoheinakajima
  Description: None
  Repository: https://github.com/yoheinakajima/babyagi

Repository: so-vits-svc (14977)
  Owner: svc-develop-team
  Description: SoftVC VITS Singing 

Filter out even more AI-related repositories.

In [7]:
def prune_repos(repos):
    """Return only repos that aren't AI-related."""
    ai_terms = [
        'gpt', 'llama', 'chat', 'llm', 'diffusion', 'alpaca',
        ' ai', 'ai ', 'ai-', '-ai', 'openai', 'whisper',
        'rlhf', 'language model', 'langchain', 'transformer', 'gpu',
        'copilot', 'deep', 'embedding', 'model', 'pytorch',
    ]
    
    non_ai_repos = []
    for repo in repos:
        # Check for AI terms in name, owner, and description.
        name = repo['name'].lower()
        if any(ai_term in name for ai_term in ai_terms):
            continue
            
        owner = repo['owner']['login'].lower()
        if any(ai_term in owner for ai_term in ai_terms):
            continue
        
        # Prune repos that don't have a description.
        if not repo['description']:
            continue

        description = repo['description'].lower()
        if any(ai_term in description for ai_term in ai_terms):
            continue
        
        non_ai_repos.append(repo)
    
    print(f"Keeping {len(non_ai_repos)} of {len(repos)} repos.")
    return non_ai_repos
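
The substring checks in `prune_repos()` are deliberately broad, which means a term like `' ai'` also matches a description containing " airflow". As a possible refinement, here is a sketch of whole-word matching with the `re` module. This is an alternative technique, not the notebook's approach; `mentions_ai` is a hypothetical helper and the short term list is only illustrative.

```python
import re

# Illustrative subset of the notebook's terms, without padding spaces.
AI_WORDS = ['ai', 'gpt', 'llm', 'llama', 'diffusion']

# \b matches at word boundaries, so 'ai' matches "ai-tools" but not "airflow".
AI_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in AI_WORDS) + r")\b",
    re.IGNORECASE,
)


def mentions_ai(text):
    """Return True if text contains any AI term as a whole word."""
    return bool(text) and AI_PATTERN.search(text) is not None
```

The trade-off is that compound names such as `gpt4free` no longer match, because the term is not a separate word there. Names like that would still need a substring check.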

In [8]:
pruned_repos = prune_repos(repo_dicts)
summarize_repos(pruned_repos)

Keeping 8 of 30 repos.

Repository: so-vits-svc (14977)
  Owner: svc-develop-team
  Description: SoftVC VITS Singing Voice Conversion
  Repository: https://github.com/svc-develop-team/so-vits-svc

Repository: sd-webui-controlnet (10010)
  Owner: Mikubill
  Description: WebUI extension for ControlNet
  Repository: https://github.com/Mikubill/sd-webui-controlnet

Repository: the-algorithm-ml (9510)
  Owner: twitter
  Description: Source code for Twitter's Recommendation Algorithm
  Repository: https://github.com/twitter/the-algorithm-ml

Repository: pynecone (9084)
  Owner: pynecone-io
  Description: 🕸 Web apps in pure Python 🐍
  Repository: https://github.com/pynecone-io/pynecone

Repository: AnimatedDrawings (8190)
  Owner: facebookresearch
  Description: Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
  Repository: https://github.com/facebookresearch/AnimatedDrawings

Repository: Monocraft (6671)
  Owner: IdreesInc
  Description: A monospaced program

Request 100 repositories per page, so more remain after pruning.
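
The earlier filtered query reported 203 matching repositories, so even a 100-item page covers only about half of them. Here is a sketch of how every page could be collected. `collect_all_pages` and its `fetch_page` argument are hypothetical: `fetch_page(page, per_page)` stands in for any function that runs one request and returns that page's list of repo dicts.

```python
import math


def collect_all_pages(fetch_page, total_count, per_page=100):
    """Gather repo dicts from every results page.

    fetch_page is a hypothetical callable: fetch_page(page, per_page)
    should return the list of repo dicts for that page.
    """
    # Round up so a partial final page is still requested.
    num_pages = math.ceil(total_count / per_page)

    all_repos = []
    for page in range(1, num_pages + 1):
        all_repos.extend(fetch_page(page, per_page))
    return all_repos
```

In this notebook, `fetch_page` could wrap `run_query()` with the `per_page` and `page` parameters appended to the URL, and `total_count` would come from the first response. Search requests are rate limited, so a short pause between pages may be needed.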

In [9]:
url = "https://api.github.com/search/repositories"
url += "?q=language:python+stars:>1000"
url += "+NOT+gpt+NOT+llama+NOT+chat+NOT+llm+NOT+diffusion"
url += "+created:2022-06-01..2023-06-01"
url += "&sort=stars&order=desc"
url += "&per_page=100&page=1"

repo_dicts = run_query(url)
pruned_repos = prune_repos(repo_dicts)
summarize_repos(pruned_repos)

Query URL: https://api.github.com/search/repositories?q=language:python+stars:>1000+NOT+gpt+NOT+llama+NOT+chat+NOT+llm+NOT+diffusion+created:2022-06-01..2023-06-01&sort=stars&order=desc&per_page=100&page=1
Status code: 200
Total repositories: 203
Complete results: True
Repositories returned: 100
Keeping 36 of 100 repos.

Repository: so-vits-svc (14977)
  Owner: svc-develop-team
  Description: SoftVC VITS Singing Voice Conversion
  Repository: https://github.com/svc-develop-team/so-vits-svc

Repository: sd-webui-controlnet (10010)
  Owner: Mikubill
  Description: WebUI extension for ControlNet
  Repository: https://github.com/Mikubill/sd-webui-controlnet

Repository: the-algorithm-ml (9510)
  Owner: twitter
  Description: Source code for Twitter's Recommendation Algorithm
  Repository: https://github.com/twitter/the-algorithm-ml

Repository: pynecone (9084)
  Owner: pynecone-io
  Description: 🕸 Web apps in pure Python 🐍
  Repository: https://github.com/pynecone-io/pynecone

Repository: 