Repository

Giannis Daras edited this page Aug 12, 2018 · 1 revision

Repository and Project Structure

Why it is not a fork of spaCy

This repository is not a fork of spaCy. It is synced regularly with the upstream spaCy repository, but it is kept as an independent repository for two reasons:

  1. Its main intention is to collect all the resources, code, and data related to Natural Language Processing of the Greek language with spaCy.
  2. Being an independent repo allows for more experimentation. Here you can include many more files and try out new ideas and approaches. If you believe you have produced something worthy of a pull request to spaCy, you can fork the spaCy repository and include there only the changes from this repository that spaCy needs. For example, both pull requests to spaCy for the Greek language (see here and here) started here but were opened from a different (personal) branch.

Branches

The base branch of this repo is the dev branch, because development is exactly what this repository is for.

In the dev branch you will find the following (regarding the Greek language):

  1. Everything included in the latest spaCy release for the Greek language, or an updated version of it.
  2. A res folder in which you will find resources related to Natural Language Processing of the Greek language. Important note: here you will find the modules folder, which contains all the independent modules that were developed as deliverables of the Google Summer of Code project.
  3. A training folder in which you will find all the code and data needed to reproduce and improve the models.
  4. Many more files and scripts that were developed during the Google Summer of Code to add Greek language support to spaCy, and that may help you better understand the code or add or improve a language.