
update README, some latex fixes

thomas--graf committed Jan 25, 2015
1 parent 9b0dec8 commit 7fed9f2e9d1b020842ee02a3884407977d144fc5
Showing with 238 additions and 11 deletions.
  1. +25 −3 README.md
  2. +206 −0 curriculum.mdown
  3. +6 −7 main.tex
  4. +1 −1 mypackages.sty
@@ -7,13 +7,13 @@ Overview

This is the primary course website for Computational Linguistics 2 (Lin 637), offered by the [Department of Linguistics] [department] at [Stony Brook University] [sbu]. For a brief list of topics, check the [syllabus] [syllabus].

This repository is publicly available and hosts the LaTeX source code for the lecture notes. Compiled pdfs of each chapter are available under the [pdf] [pdf] folder.
This repository is publicly accessible and hosts the LaTeX source code for the lecture notes. Compiled pdfs of each chapter are available in the [pdf] [pdf] folder.


Prerequisites
-------------

This course assumes a certain degree of familiarity with generative syntax, phonology, and basic mathematics (sets, functions, relations, first-order logic). Please take the [online survey] [survey] to ensure that you satisfy the prerequisites. If you have weaknesses, consult the relevant material suggested in the [readings repository] [readings].
This course assumes a certain degree of familiarity with generative syntax, phonology, and basic mathematics (sets, functions, relations, first-order logic). Please take the [online survey] [survey] to ensure that you satisfy the prerequisites. If you have weaknesses, consult the relevant material suggested in the [readings repository] [readings] (access restricted to enrolled students).

In addition, you will have to use Python, markdown and LaTeX at various points during this course. The [link list](#Link_List) at the end of this document has some useful tutorials.

@@ -51,8 +51,30 @@ If you want to compile the lecture notes yourself, or use them as the basis for
Link List
---------

coming soon
### Using git

- [Github app for Windows](http://windows.github.com) (supports only Windows 7 or later)
- [Github app for Mac](http://mac.github.com) (supports only OS X 10.9 or later)
- List of alternative [GUI clients for git](http://git-scm.com/downloads/guis)
- Tutorials for using [git via the command line](https://www.atlassian.com/git/tutorials)
- Official [documentation for git](http://git-scm.com/doc)

### Markdown

- Interactive tutorial to [markdown basics](http://markdowntutorial.com/)
- [Complete markdown syntax](http://daringfireball.net/projects/markdown/syntax)
- Overview of [Github's markdown dialect](https://help.github.com/categories/writing-on-github/)

### LaTeX

- [Overleaf](https://www.overleaf.com/) (formerly writeLaTeX) is an online LaTeX editor with live preview
- List of [commonly used math symbols](http://www.artofproblemsolving.com/Wiki/index.php/LaTeX:Symbols)
- Andrew Roberts' [Getting to Grips with LaTeX](http://www.andy-roberts.net/writing/latex)

### Python

- A succinct yet extensive [tutorial for Python 3](http://www.python-course.eu/python3_course.php)
- The official [Python 3 documentation](https://docs.python.org/3/)

[department]: http://linguistics.stonybrook.edu
[pdf]: ../../tree/master/pdf
@@ -0,0 +1,206 @@
## Who am I

## What is Computational Linguistics?

- what we barely cover
- probabilistic methods (JurafskyMartin, ManningSchütze, GemanJohnson2003)

- language as a computational problem
- how is language computed
- cognitive
- applications (learning from the masters)
- what are its computational properties
- can we use these properties to make sense of empirical phenomena
- do linguistic domains exhibit computational differences
- are linguistic ideas about computability/economy plausible?

- readings:
Penn 2006: Symbolic Computational Linguistics
Pullum & Kornai: Mathematical Linguistics
Kornai: Mathematical Linguistics, Ch1 & 10
Savitch & Manaster-Ramer: Generative Capacity Matters
Krahmer10: Computational Linguistics and Psychology
Wilks: Computational Linguistics History
RiggleFeatureChart

- phonology

- segments and strings
- formalizing strings
- how does formalization proceed?
- set out axioms, base terms
- define complex concepts in terms of these simpler ones
- definition must be precise enough that one can tell for any object in the domain of study whether it satisfies the definition or not
- give examples of bad definitions from literature (e.g. Norvin Richards thesis)
- why bother with formalization?
- Chomsky quote; see also my thesis; Müller's 3.7.2
- circle vs linearly ordered graph; which one is a string?
- formalization VS implementation
- python implementation is not in terms of sets with ordering function
- python makes additional distinctions (list VS string)

- string languages
- is phonology infinite?
- why we assume it nonetheless
- nonce words follow a system --> generalization
- succinctness
- Savitch paper

- dependencies
- local
- non-local (why don't we model it as local?)
- existence/absence conditions
- uniqueness conditions (tone?)
- interval conditions

- how would we code this up?
- bigrams
- k-factor; local interpretation
- conjunction of negated literals; string as model of formula
- "if you can't say it in two different ways, then you can't say it at all"
- closure properties: complementation, intersection, not union, relabeling ( b --> a; (ab)* --> (aa)* )
- local substring substitution closure
- boolean algebra of grammars
- learnability
- lattice structure
- adding probabilities
- inferring probabilities
- smoothing techniques
- probabilistic algebra (associativity --> doesn't matter if we scan left to right!)
- reading: SmithJohnson on WCFGs and PCFGs
- generalization to n-grams
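
The bigram and k-factor items above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not part of the course materials: the function names `k_factors` and `is_well_formed`, the edge markers `>` and `<`, and the toy grammar are all made up here. A strictly k-local grammar is encoded as a set of forbidden k-factors, so a string is well-formed iff it contains none of them (the "conjunction of negated literals" reading).

```python
def k_factors(string, k=2, left=">", right="<"):
    """Return the set of k-factors (length-k substrings) of the string
    after padding it with k-1 edge markers on each side."""
    padded = left * (k - 1) + string + right * (k - 1)
    return {padded[i:i + k] for i in range(len(padded) - k + 1)}


def is_well_formed(string, forbidden, k=2):
    """A string is well-formed iff it contains no forbidden k-factor."""
    return k_factors(string, k).isdisjoint(forbidden)


# Hypothetical toy grammar (an assumption for illustration):
# no two adjacent b's, and no word-final a.
forbidden_bigrams = {"bb", "a<"}
```

With this toy grammar, `is_well_formed("abab", forbidden_bigrams)` holds, while `"abba"` is rejected for containing `bb` and `"aba"` for ending in `a`. Under this encoding, intersecting two such languages (for the same k) amounts to taking the union of their forbidden sets.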

- up the ladder
- strictly piecewise
- k-factor with precedence interpretation
- conjunction of negated literals
- intersection of good tails (where does this belong? check Jeff's thesis)
- locally testable (at least one)
- boolean closure
- locally threshold testable (exactly one; primary stress)
- existential quantification
- star-free
- first-order logic
- interval conditions
- counter-free languages
- reading: Pullum&Rogers () Animal Pattern Learning Experiments

- finite-state
- hidden alphabet bigrams
- automata
- automaton constructions
- complementation
- union
- intersection
- Myhill-Nerode
- regular expressions
- pumping lemma
- non-determinism
- powerset construction (size VS speed trade-off)
- mso
- connection between non-determinism and existential quantification
- phonology: primary stress in Creek and Cairene Arabic
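
The powerset construction listed above can be sketched as follows. This is a minimal sketch under assumptions, not the course's implementation: the function names `determinize` and `accepts`, the dictionary encoding of the transition relation, and the example automaton are all hypothetical, and epsilon transitions are not handled. Only reachable subset states are built, which is one way to soften the size side of the size VS speed trade-off.

```python
from collections import deque


def determinize(alphabet, delta, start, finals):
    """Powerset construction, building only the reachable subset states.

    delta maps (state, symbol) to a set of successor states (an NFA
    without epsilon transitions). Returns (dfa_delta, dfa_start,
    dfa_finals), where every DFA state is a frozenset of NFA states.
    """
    dfa_start = frozenset([start])
    dfa_delta, dfa_finals = {}, set()
    seen, queue = {dfa_start}, deque([dfa_start])
    while queue:
        current = queue.popleft()
        if current & finals:  # contains at least one final NFA state
            dfa_finals.add(current)
        for symbol in alphabet:
            target = frozenset(
                q for state in current for q in delta.get((state, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_delta, dfa_start, dfa_finals


def accepts(dfa, string):
    """Run the deterministic automaton on the string."""
    dfa_delta, state, dfa_finals = dfa
    for symbol in string:
        state = dfa_delta[(state, symbol)]
    return state in dfa_finals


# Hypothetical NFA: strings over {a, b} whose second-to-last symbol is a.
nfa_delta = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "a"): {2},
    (1, "b"): {2},
}
dfa = determinize("ab", nfa_delta, 0, {2})
```

The non-deterministic automaton guesses where the second-to-last symbol is; the deterministic one instead tracks every guess at once, one subset state per set of live guesses.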

- finite-state semantics
- describing event structure
- generalized quantifiers

- transductions
- finite-state
- subsequential
- closure properties
- application to phonology and morphology (2-level morphology)
- equivalence of SPE and OT
Readings:
Riggle Thesis, Generating Contenders, Violation Semirings
PyPhon

- Automaton-Grammar connection --> switch to trees --> syntax

- Literature
- Heinz survey papers
- Bird Computational Phonology
- BirdEllison on Autosegmental Phonology
- Heinz on Tier-local Phonology
- Heinz thesis
- McNaughton & Papert
- KeenanMoss
- Sipser
- Kozen
- HopcroftUllman
- RegMSO equivalence (Morawietz)
- Kenstowicz06 Phonology survey paper

- syntax

- weak generative capacity: syntax is not regular
reading: MohriSproat On A Common Fallacy
HeinzIdsardi (Science and TopiCS)

- can probabilities salvage regular models?
- hidden markov models
- yes and no
- do increase performance
- do not provide right structures for semantic interpretation
- probabilities conflate many issues
- collocation/transition probability (I shiveringly admonished his popsicle; colorless green ideas sleep furiously)
- word frequency (vex VS irritate, erudite VS educated)
- world-knowledge (I saw [a movie with Heidecker] VS I saw [a movie] [with Tim])

- formalizing trees
- graph
- Gorn-domains
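
As a rough illustration of the Gorn-domain view of trees, the following Python sketch maps each node of a tree to its address and checks the two closure conditions on a set of addresses. Everything here is an assumption made for illustration: the function names `gorn_domain` and `is_tree_domain`, the nested-tuple encoding of trees, the example tree, and the choice of 0-based child indices (1-based numbering is also common).

```python
def gorn_domain(tree, address=()):
    """Map each Gorn address (a tuple of child indices) to the node label.

    A tree is a nested tuple (label, child_1, ..., child_n); the root has
    the empty address and the i-th child of node u has address u + (i,).
    """
    label, *children = tree
    domain = {address: label}
    for i, child in enumerate(children):
        domain.update(gorn_domain(child, address + (i,)))
    return domain


def is_tree_domain(addresses):
    """Check prefix closure and left-sibling closure of a set of addresses."""
    addresses = set(addresses)
    for u in addresses:
        if u and u[:-1] not in addresses:
            return False  # the mother of u must be present
        if u and u[-1] > 0 and u[:-1] + (u[-1] - 1,) not in addresses:
            return False  # the left sibling of u must be present
    return True


# Hypothetical example tree: [S [NP John] [VP sleeps]]
tree = ("S", ("NP", ("John",)), ("VP", ("sleeps",)))
domain = gorn_domain(tree)
```

Here the addresses themselves encode dominance (prefix) and precedence (index order), so no separate graph structure is needed.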

- local tree languages/CFGs
- subtree substitution closure
- feature grammars/unification
- head projection/category refinement
- tree intersection != string intersection

- recognizable tree languages
- CFL string yields (easily proved via Thatcher's theorem)
- reading: Rogers96 Strictly Local: Recognizable

- weak generative capacity: syntax is not context-free

- TAG
- MGs

- 2-step perspective

- tree transductions
- synchronous grammars
- tree transducers
- logical tree transductions
- new perspective of the T-model

- Literature
- GecsegSteinby
- Fülöp book
- Comon et al
- TAG anthology
- Kobele06
- Trautwein Computational Pitfalls
- Müller Syntax Textbook

- parsing

algorithmic concepts
bigO
binary search

data structures
string
list
linked list
stack
array
hash table
adjacency matrix
adjacency list
priority queue

techniques
divide and conquer
dynamic programming, memoization
linear programming
@@ -3,20 +3,21 @@
%=================================================================
\documentclass[11pt,letterpaper]{book}

\newcommand{\theauthor}{Thomas Graf}
\newcommand{\theauthor}{Thomas Graf}
\newcommand{\lastname}{Graf}
\newcommand{\university}{Stony Brook University}
\newcommand{\emailaddress}{lin637@thomasgraf.net}
\newcommand{\coursenumber}{Lin637}
\newcommand{\coursename}{Computational Linguistics 2}
\newcommand{\thetitle}{\texorpdfstring{\coursenumber\\ \coursename}{\coursenumber --- \coursename}}
\newcommand{\semester}{Spring 2015}
\newcommand{\thetitle}{\coursename\ [\coursenumber, \semester]}
\newcommand{\thekeywords}{graduate level, lecture, computational linguistics, phonology, syntax}
\newcommand{\thedate}{}

\usepackage{mypackages}
\usepackage{mycommands}



%=================================================================
% title format
%=================================================================
@@ -29,10 +30,11 @@
%=================================================================
% \includeonly{./tex/ConstituencyTests}

\pagestyle{empty}
\begin{document}
\maketitle
\raggedbottom
\pagenumbering{Roman}
\maketitle
\tableofcontents
\clearpage

@@ -44,9 +46,6 @@
\include{./tex/learnabilitysl}
\include{./tex/probabilisticlocal}

\pagestyle{empty}
% \include{./tex/h1}

\bibliographystyle{../../linquiry3}
\bibliography{../../universal,../../graf}
\end{document}
@@ -160,7 +160,7 @@
% first we remove the standard headers and footers
\fancyhf{}
\fancyheadoffset[RO,LE]{\marginparsep+\marginparwidth}
\fancyhead[C]{\HeaderFontSize Graf - Computational Linguistics 2, Spring 2015}
\fancyhead[C]{\HeaderFontSize \lastname\ - \coursename, \semester}
\fancyhead[RO,LE]{\bfseries \thepage}

\fancypagestyle{plain}{

