Home

Welcome to the Hyphe wiki!

Info

Considering getting a server for Hyphe? What is the recommended specification?

Reference Tutorials

The following tutorials have been designed to cover all the basic features of Hyphe. The first tutorials focus more on explaining the elementary notions and techniques in the context of simple, top-down protocols. The last tutorials focus on more elaborate research designs and methodological questions.

Reference Tutorial 1: Website Structure

Analyse the structure of a website, in this case: https://climate.nasa.gov/

Creating a new corpus
Defining and crawling a web entity
Looking at the network of pages inside a web entity
Looking at the folder structure of the pages of a web entity

Reference Tutorial 2: Google Search Results

Study how the results of different Google queries are hyperlinked, in this case queries about climate change.

Understanding the different statuses: IN, OUT, UNDECIDED & DISCOVERED
Changing the status of one or more web entities
Crawling all DISCOVERED web entities
Looking at a network of web entities

Reference Tutorial 3: Actors and their Ties

Retrace the network of a series of actors through hyperlinks, in this case the institutions of the IPCC authors.

Importing a CSV to crawl from a file
How to properly set the boundaries of a web entity
Detecting crawl issues and fixing start page errors
Tagging and advanced use of the Tagging page

About the Hypertext Corpus Initiative

Hyphe was born from the Hypertext Corpus Initiative (HCI), a research group initiated by médialab Sciences Po in October 2010 to address the issue of building hypertext corpora for the social sciences and potentially other domains.

The group first met at the kickoff workshop at médialab Sciences Po in Paris on 7 and 8 October 2010.

The HCI's ambitions are to discuss methodology for hypertext corpora and to integrate existing web mining tools into a common technical chain.

Hyphe has been funded by the Equipex DIME-SHS (ANR-10-EQPX-19-01).
