Starred repositories
A curated list of resources for Chinese NLP
Explore cultural collections along time, texture and themes
Given a scholarly PDF, extract figures, tables, captions, and section titles.
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
A very simple framework for state-of-the-art Natural Language Processing (NLP)
Entity linking system for Wikidata updated by your edits in real time
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
newspaper3k is a news, full-text, and article metadata extraction library for Python 3.
Download Spotify songs to mp3 with full metadata and cover art!
The fastai book, published as Jupyter Notebooks
Recreations of W.E.B. Du Bois's Data Portraits
Download YouTube comments from numerous videos, playlists, and channels for archiving, general search, and showing activity.
Universal JavaScript app, made with ECMAScript modules running natively in a browser or local environment 🖼️
🚇 An instance of councilmatic for LA Metro
Anonymising tweet downloader that uses the Twitter Academic API full-archive search
A Python wrapper around the topic modeling functions of MALLET.
Data conversions and examples for generating reports from twarc collections using tools such as D3.js
Code Repository for Clustering and Classification with Machine Learning in R, published by Packt
A command line tool (and Python library) for archiving Twitter JSON
SDK for running DeepLabCut on a live video stream
A shell script to set up a macOS laptop for web and mobile development.
Shiny app for dimension reduction using UMAP and t-SNE