URL testing lists intended for discovering website censorship
Latest commit 0481129 Aug 7, 2018


Usage

What Is It?

This repository contains URL testing lists intended to help in testing URL censorship, divided by country code. In addition to these local lists, a global list covers a wide range of internationally relevant and popular websites, including sites with content that is perceived to be provocative or objectionable. Most of the websites on the global list are in English. In contrast, the local lists are designed individually for each country by regional experts. They contain content representing a wide range of categories at the local and regional levels, as well as content in local languages. In countries where Internet censorship has been reported, the local lists also include many of the sites that are alleged to have been blocked.

Categories are divided among four broad themes:

  • Political (This category is focused primarily on Web sites that express views in opposition to those of the current government. Content more broadly related to human rights, freedom of expression, minority rights, and religious movements is also considered here.)

  • Social (This group covers material related to sexuality, gambling, and illegal drugs and alcohol, as well as other topics that may be socially sensitive or perceived as offensive.)

  • Conflict/Security (Content related to armed conflicts, border disputes, separatist movements, and militant groups is included in this category.)

  • Internet Tools (Web sites that provide e-mail, Internet hosting, search, translation, Voice over Internet Protocol (VoIP) telephone service, and circumvention methods are grouped in this category.)

More information about testing methodology can be found here.

The only testing list that applies regionally (to more than one country) is the CIS testing list, which is intended for testing former Commonwealth of Independent States nations.

Lists are available in both CSV and JSON format.
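As a minimal sketch of how a list in either format might be read, the Python below parses CSV text and JSON text into a list of entries and counts entries per category. The column names `url` and `category_code`, the sample rows, the category values, and the assumption that a JSON list is an array of objects are all hypothetical and chosen only for illustration; check the actual files for their real layout before relying on this.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical sample data. The column names and category values are
# assumptions for illustration, not taken from the real lists.
SAMPLE_CSV = """url,category_code
https://example.com/,NEWS
https://example.org/,POLR
https://example.net/,NEWS
"""

SAMPLE_JSON = json.dumps([
    {"url": "https://example.com/", "category_code": "NEWS"},
    {"url": "https://example.org/", "category_code": "POLR"},
])


def load_csv_list(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def load_json_list(text):
    """Parse JSON text that is assumed to hold an array of entry objects."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array of entries")
    return entries


def count_by_category(entries, key="category_code"):
    """Count how many entries fall under each value of the given key."""
    return Counter(entry[key] for entry in entries)


if __name__ == "__main__":
    for name, entries in (
        ("csv", load_csv_list(SAMPLE_CSV)),
        ("json", load_json_list(SAMPLE_JSON)),
    ):
        print(name, dict(count_by_category(entries)))
```

To read a real file instead of the inline samples, the text can be obtained with `open(path, newline="", encoding="utf-8").read()` and passed to the matching loader.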

Please note that these lists are not the entirety of the testing lists; they are only the newest list for each unique country code.

Contributing URLs

To learn how to contribute URLs for testing see: https://ooni.torproject.org/get-involved/contribute-test-lists/

Citation

If you use this dataset in a publication, please cite it using the following BibTeX entry.

@misc{testlist,
  title={URL testing lists intended for discovering website censorship},
  author={Citizen Lab and Others},
  year={2014},
  url={https://github.com/citizenlab/test-lists},
  note={\href{https://github.com/citizenlab/test-lists}{https://github.com/citizenlab/test-lists}}
}

An example Chicago Style citation is included below:

Citizen Lab and Others. 2014. URL Testing Lists Intended for Discovering Website Censorship. https://github.com/citizenlab/test-lists.

License

All data is provided under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license, which is available in full here and summarized here.