apjanco/spaCy_workshops

Monday

Opening session

13:30-14:00, Kari Tanácsterem (A039)

14:00-15:45, A330

  • Variables
  • Data Types
  • Operators
  • Loops
  • Indexing
  • Functions
  • Classes
  • Flow Control
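
As a rough taste of how these topics fit together, here is a minimal sketch in plain Python. It is not taken from the workshop materials; the `Token` class and `count_title_tokens` function are hypothetical names invented for illustration, and the class is unrelated to spaCy's own `Token`.

```python
class Token:
    """A toy word wrapper (hypothetical; not spaCy's Token class)."""

    def __init__(self, text):
        # Variable holding a str, one of Python's built-in data types
        self.text = text

    def is_title(self):
        # True when the word is title-cased, e.g. "Andy"
        return self.text.istitle()


def count_title_tokens(sentence):
    """Count the title-cased words in a sentence."""
    tokens = [Token(word) for word in sentence.split()]
    count = 0
    for token in tokens:        # loop
        if token.is_title():    # flow control
            count += 1          # operator
    return count


sentence = "Andy and David teach spaCy on Tuesday"
first_word = sentence.split()[0]   # indexing
title_count = count_title_tokens(sentence)
print(first_word, title_count)
```

The example only uses the standard library, so it runs without installing spaCy.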

16:15-18:00, Leonard

  • Input/Output
  • Modules
  • LXML
  • Pandas
  • Plotting

If you would like a free Prodigy license, please fill out the form here

Tuesday

9:00-10:45, Andy

  • Python strings
  • Language objects, doc, sents, tokens
  • POS
  • NER (w/ pre-trained models)
  • displacy
  • available spaCy models
  • Adding models from spacy-stanfordnlp

11:15-13:00, David

  • standoff converter
  • adding extensions
  • automated markup
    • NER
    • linguistic features

14:00-15:15, Andy and David (short session)

  • Rule-based matching
  • adding pipeline components
  • if time, fasttext, MUSE

15:45-17:00, Andy

  • training data
  • training spaCy models (ner, textcat, pos, dep, semantic similarity)
  • Prodigy
  • discussion

Wednesday

9:00-10:45, Andy and David

  • spaCy IRL
  • course.spacy.io
  • other learning resources
  • scattertext (finding distinguishing terms in small-to-medium-sized corpora, and presenting them in a sexy, interactive scatter plot with non-overlapping term labels)
  • Named Entity Linking
  • spacy-pytorch-transformers & spacy pretrain
  • concluding discussion