# Overview of Languages and Linguistics

## Subfields of linguistics

Phonetics and Phonology: Phonetics is the study of the physical properties of sounds in human language, including their production, transmission, and perception. Phonology is the study of the rules and patterns governing the organization of sounds in a language.

Grammar: The set of rules and principles governing the structure of words and sentences in a language. It is commonly divided into morphology and syntax:

- Morphology: The study of the structure and formation of words in a language, including the rules for combining morphemes to create new words.

- Syntax: The study of the rules governing the structure and organization of sentences, including the relationships between words, phrases, and clauses.

Semantics: The study of meaning in language, including how meaning is structured, composed, and conveyed through words, phrases, and sentences.

Pragmatics: The study of how context influences the interpretation and use of language, including factors such as speaker intention, shared knowledge, and social context.

Discourse: The study of language in use, including how sentences and utterances are organized and structured to convey meaning and achieve communicative goals in spoken and written texts.

Sociolinguistics: The study of the relationship between language and society, including how language varies based on social factors such as region, social class, gender, and age.

Psycholinguistics: The study of the cognitive processes involved in language production, comprehension, and acquisition, as well as the neurological and psychological aspects of language.

Computational linguistics: An interdisciplinary field that combines linguistics, computer science, and artificial intelligence to develop algorithms and models for processing and understanding human language.

## Language and communication

Communication between a speaker and a listener involves several interacting components.

Here is a breakdown of the key components:

Speaker
- Intention: The goals, shared knowledge, and beliefs of the speaker guide the formation of a message to be conveyed.

- Generation (Tactical): The process of constructing a message (sentence or utterance) based on the speaker's intention.

- Synthesis (Text or Speech): The final stage in the speaker's role, where the generated message is presented either in written (text) or spoken (speech) form.

Listener

- Perception: The listener receives the message, processing the text or speech signals.

- Interpretation: The listener interprets the message through multiple levels:

    - Syntactic: Analyzing the structure and grammar of the message.

    - Semantic: Determining the literal meaning of the message.

    - Pragmatic: Understanding the intended message, considering context and non-literal meanings.

- Incorporation: The listener internalizes and understands the message, integrating it into their existing knowledge and beliefs.

Both (Speaker and Listener)

- Context: Both the speaker and listener rely on the context in which the communication takes place. Context provides a shared understanding and helps to disambiguate and clarify the intended meaning.
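The stages above can be read as a pipeline. The following is a minimal sketch under loud assumptions: the `Intention` class, the function names, the template sentence, and the `"dinner table"` context key are all hypothetical and chosen only for illustration. It shows how the pragmatic level can override the semantic reading when context is available.

```python
from dataclasses import dataclass


@dataclass
class Intention:
    act: str     # hypothetical label, e.g. "request"
    target: str  # hypothetical label, e.g. "salt"


def generate(intention: Intention) -> str:
    """Generation: build an utterance from the intention (toy template)."""
    if intention.act == "request":
        return f"Can you pass the {intention.target}?"
    return f"Here is the {intention.target}."


def synthesize(utterance: str) -> str:
    """Synthesis: emit the utterance as text (speech is out of scope here)."""
    return utterance


def perceive(signal: str) -> str:
    """Perception: receive the signal and normalize surrounding whitespace."""
    return signal.strip()


def interpret(message: str, context: dict) -> dict:
    """Interpretation at the syntactic, semantic, and pragmatic levels."""
    syntactic = "interrogative" if message.endswith("?") else "declarative"
    asks_ability = message.lower().startswith("can you")
    semantic = "question about ability" if asks_ability else "statement"
    # Pragmatic level: shared context turns the ability question into a request.
    if asks_ability and context.get("setting") == "dinner table":
        pragmatic = "request"
    else:
        pragmatic = semantic
    return {"syntactic": syntactic, "semantic": semantic, "pragmatic": pragmatic}


def incorporate(beliefs: dict, reading: dict) -> dict:
    """Incorporation: fold the chosen reading into the listener's beliefs."""
    return {**beliefs, "speaker_act": reading["pragmatic"]}


context = {"setting": "dinner table"}
message = perceive(synthesize(generate(Intention("request", "salt"))))
reading = interpret(message, context)
beliefs = incorporate({}, reading)
print(reading["pragmatic"])  # → request
```

Calling `interpret(message, {})` with an empty context leaves the pragmatic reading equal to the semantic one, which is the sense in which context helps to disambiguate.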

## Sources of ambiguity

<table>
  <thead>
    <tr>
      <th>Source</th>
      <th>Type</th>
      <th>Example</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td rowspan="7">Syntactic</td>
      <td rowspan="3">Attachment</td>
      <td>Noun phrase: He fed her cat food (was her cat fed food, or was she fed cat food?)</td>
    </tr>
    <tr>
      <td>Verb phrase: The chicken is ready to eat (is the chicken about to eat, or about to be eaten?)</td>
    </tr>
    <tr>
      <td>Prepositional phrase: I shot an elephant in my pajamas (who was wearing the pajamas?)</td>
    </tr>
    <tr>
      <td rowspan="2">Coordination</td>
      <td>Conjunctive: I like red apples and pears (are the pears also red?)</td>
    </tr>
    <tr>
      <td>Disjunctive: Coffee or tea? (must exactly one be chosen, or can it be both?)</td>
    </tr>
    <tr>
      <td>Parallelism</td>
      <td>The coach told the players they should exercise regularly, eat well, and to get enough sleep (exercise and eat well in order to get enough sleep, or three recommendations in parallel?)</td>
    </tr>
    <tr>
      <td>Ellipsis</td>
      <td>He offered to help, but she refused (did she refuse his help, or refuse to offer help herself?)</td>
    </tr>
    <tr>
      <td rowspan="5">Semantic</td>
      <td rowspan="4">Lexical</td>
      <td>Homograph: "bow" (a weapon used for shooting arrows) vs. "bow" (a knot made with a ribbon or a string)</td>
    </tr>
    <tr>
      <td>Homophone: "write" (to compose text) vs. "right" (correct or a direction)</td>
    </tr>
    <tr>
      <td>Polysemy: "bank" (a financial institution) vs. "bank" (the building that houses one)</td>
    </tr>
    <tr>
      <td>Metonymy: "The White House announced a new policy." (referring to the U.S. president or administration, not the building itself)</td>
    </tr>
    <tr>
      <td>Combinational</td><td>Overlaps with syntactic ambiguity</td>
    </tr>
    <tr>
      <td rowspan="3">Pragmatic</td>
      <td>Referential</td><td>Joe yelled at Mike. He had broken the bike. (who broke the bike?)</td>
    </tr>
    <tr>
      <td>Implicature</td><td>"Can you pass the salt?" (The speaker is actually requesting the salt, not asking about the listener's ability to pass it.)</td>
    </tr>
    <tr>
      <td>Stress</td><td>"I didn't say she stole my money." (7 interpretations, one for each word that can be stressed)</td>
    </tr>
  </tbody>
</table>

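Attachment ambiguity can be made concrete by counting parse trees. The sketch below is a toy illustration, not a real parser: the grammar rules and lexicon are assumptions written by hand for the elephant sentence in the table (in its usual form, "I shot an elephant in my pajamas"), and `count_parses` is a hypothetical helper that runs a CKY-style chart over a grammar in Chomsky normal form.

```python
from collections import defaultdict

# Hand-written toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),  # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),  # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]

# Toy lexicon covering only the words used in the example.
LEXICON = {
    "I": ["NP"], "shot": ["V"], "an": ["Det"], "elephant": ["N"],
    "in": ["P"], "my": ["Det"], "pajamas": ["N"],
}


def count_parses(words, start="S"):
    """Count distinct parse trees for `words` with a CKY-style chart."""
    n = len(words)
    # chart[i][j][X] = number of ways category X can derive words[i:j]
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for tag in LEXICON[word]:
            chart[i][i + 1][tag] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    ways = chart[i][k].get(left, 0) * chart[k][j].get(right, 0)
                    if ways:
                        chart[i][j][parent] += ways
    return chart[0][n].get(start, 0)


print(count_parses("I shot an elephant".split()))                # → 1
print(count_parses("I shot an elephant in my pajamas".split()))  # → 2
```

The two parses of the longer sentence correspond to the two readings: the prepositional phrase modifies either the shooting (verb-phrase attachment) or the elephant (noun-phrase attachment).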

## Language universals

Language universals are features or characteristics that are common to all human languages. 

Understanding language universals helps linguists and NLP researchers to better grasp the underlying structures and patterns that are shared across languages, which can inform the design of more efficient and accurate NLP systems.


Language universals can be classified into two types:

- Unconditional universals: These are features that are found in every language without exception.

    - All languages have verbs and nouns: Every language has a way to represent actions (verbs) and entities (nouns).
    
    - All spoken languages have consonants and vowels: All spoken languages use a combination of consonant and vowel sounds to form words and convey meaning.

- Conditional universals: These are features that, if present in a language, entail the presence of other features or follow specific patterns.

    - If a language has declarative sentences with a nominal subject and object, the dominant word order almost always places the subject before the object (e.g., "John ate the apple").

    - If a language uses inflection to change the form of words (e.g., adding -s to mark a noun as plural in English), it also has derivation, which creates new words by adding affixes or altering the base form of a word (e.g., adding -ness to the adjective "happy" to form the noun "happiness").
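A conditional universal has the logical form "if P, then Q", so a language counts against it only when P holds and Q does not. The sketch below assumes a hypothetical feature table: the language names `lang_a` to `lang_c` and their values are placeholders, not real data, and `violations` is an illustrative helper.

```python
def violations(languages, antecedent, consequent):
    """Return the languages where the antecedent holds but the consequent does not."""
    return [
        name
        for name, features in languages.items()
        if features.get(antecedent) and not features.get(consequent)
    ]


# Placeholder data: names and feature values are invented for illustration.
SAMPLE = {
    "lang_a": {"inflection": True, "derivation": True},
    "lang_b": {"inflection": False, "derivation": True},
    "lang_c": {"inflection": False, "derivation": False},
}

# "If inflection, then derivation" has no counterexample in this sample.
print(violations(SAMPLE, "inflection", "derivation"))  # → []

# The converse is a different claim and fails for lang_b.
print(violations(SAMPLE, "derivation", "inflection"))  # → ['lang_b']
```

The second call shows that the implication runs in one direction only: a language with derivation but no inflection does not contradict the universal.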


### Example

The World Atlas of Language Structures (WALS) is a database that provides information on various features of languages from around the world.

Feature 83A: Order of Object and Verb. This feature records the dominant order of object and verb in each language:

- 713 languages with Object-Verb (OV) order

- 705 languages with Verb-Object (VO) order

- 101 languages with no dominant order


Feature 18A: Absence of Common Consonants. This feature records which common consonant types are absent from a language:

- No bilabials: 5 languages

- No fricatives: 49 languages

- No nasals: 12 languages

Feature 67A: Inflectional future tense. This feature examines the presence or absence of inflectional future tense markers in languages:

- Yes (inflectional future tense present): 110 languages

- No (inflectional future tense absent): 112 languages
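To compare features sampled over different numbers of languages, the counts can be turned into percentages. This is a small sketch using only the counts listed above for features 83A and 67A. The `shares` helper and the dictionary keys are illustrative names, and feature 18A is left out because the three counts listed for it do not cover every language in its sample.

```python
def shares(counts):
    """Convert raw language counts into percentages of the sampled total."""
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


# Counts copied from the lists above.
FEATURE_83A = {"OV": 713, "VO": 705, "no dominant order": 101}
FEATURE_67A = {"inflectional future": 110, "no inflectional future": 112}

print(sum(FEATURE_83A.values()))  # → 1519
print(shares(FEATURE_83A))  # → {'OV': 46.9, 'VO': 46.4, 'no dominant order': 6.6}
print(shares(FEATURE_67A))  # → {'inflectional future': 49.5, 'no inflectional future': 50.5}
```

Neither feature shows a value shared by all sampled languages. OV and VO are almost evenly split, and so are languages with and without an inflectional future.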