@Taiwan-Social-Media-Corpus

Taiwan Social Media Corpus

A major social media corpus of Taiwan, provided for academic research.

Welcome to software development at the National Taiwan University (NTU) 🙌

We are building a large-scale, diverse, and linguistically enriched social media corpus of Mandarin in Taiwan.

Pinned

  1. async-scraptt Public

    A Python web scraper for extracting post content and comments from the PTT website.

    Python 1 3

  2. _rosetta Public

    A large-scale, diverse, and linguistically enriched social media corpus of Mandarin in Taiwan.

    TypeScript

Repositories

Showing 9 of 9 repositories
  • mercury Public
    JavaScript 0 0 2 0 Updated Jul 30, 2024
  • sol Public
    TypeScript 0 0 2 0 Updated Jul 29, 2024
  • _rosetta Public

    A large-scale, diverse, and linguistically enriched social media corpus of Mandarin in Taiwan.

    TypeScript 0 0 1 0 Updated Jul 9, 2024
  • async-scraptt Public

    A Python web scraper for extracting post content and comments from the PTT website.

    Python 1 Apache-2.0 3 0 0 Updated Jun 23, 2024
  • ckip-2-tei Public

    A Python package that asynchronously segments JSON data and outputs it in TEI XML format.

    Python 0 Apache-2.0 0 0 0 Updated Apr 29, 2024
  • blacklab-demo Public

    A repo that demonstrates how to build a BlackLab corpus using Docker and Nginx.

    Shell 0 0 0 0 Updated Apr 29, 2024
  • corpus-frontend Public

    A large-scale, diverse, and linguistically enriched social media corpus of Mandarin in Taiwan.

    TypeScript 0 0 0 0 Updated May 17, 2023
  • .github Public

    We are building a large-scale, diverse, and linguistically enriched social media corpus of Mandarin in Taiwan.

    0 0 0 0 Updated Dec 10, 2022
  • scraptt Public

    The most comprehensive PTT (踢踢踢) Crawler

    Python 3 1 0 0 Updated Sep 2, 2018
