The task of removing only the inflectional endings of a word and returning its base dictionary form, which is known as a lemma.
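A minimal sketch of the idea, assuming a tiny hand-written lookup table. `LEMMA_TABLE` and `lemmatize` are hypothetical names introduced for illustration; a real lemmatizer would rely on a full dictionary and morphological analysis.

```python
# Hypothetical lookup table mapping inflected forms to lemmas.
# The entries are illustrative only.
LEMMA_TABLE = {
    "closed": "close",
    "closing": "close",
    "mice": "mouse",
    "went": "go",
}


def lemmatize(word: str) -> str:
    """Return the lemma of `word`, or the lowercased word itself if it is unknown."""
    key = word.lower()
    return LEMMA_TABLE.get(key, key)


print([lemmatize(w) for w in ["Closed", "mice", "went", "table"]])
```

Unlike a stem, the value returned here is always a dictionary word, at the cost of needing an entry for every irregular form.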
The process of reducing inflected (or sometimes derived) words to their root form (e.g. "close" is the root of "closed", "closing", "close", "closer", etc.).
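The following is a rough sketch of suffix stripping, not an established stemming algorithm. The suffix list and the name `toy_stem` are assumptions made for illustration. Note that this toy version maps all four example words to the stem "clos" rather than the word "close", since a stem need not be a dictionary word.

```python
# Illustrative suffix list; real stemmers use far more elaborate rules.
SUFFIXES = ("ing", "ed", "er", "s")


def toy_stem(word: str) -> str:
    """Strip one common suffix, then a trailing 'e', from a lowercased word."""
    stem = word.lower()
    for suffix in SUFFIXES:
        # Only strip when at least three characters would remain.
        if stem.endswith(suffix) and len(stem) - len(suffix) >= 3:
            stem = stem[: -len(suffix)]
            break
    # Drop a final "e" so that "close" and "closing" share one stem.
    if stem.endswith("e") and len(stem) > 3:
        stem = stem[:-1]
    return stem


print({w: toy_stem(w) for w in ["closed", "closing", "close", "closer"]})
```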
Given a sentence, determine the part of speech (POS) for each word. Many words, especially common ones, can serve as multiple parts of speech. For example, "book" can be a noun ("the book on the table") or a verb ("to book a flight").
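The "book" example can be sketched with a toy rule-based tagger that looks only at the previous word. The lexicon, the tag names, and the function `tag` are assumptions for illustration; practical taggers are usually statistical and use much wider context.

```python
DETERMINERS = {"the", "a", "an"}

# Hypothetical lexicon: each word maps to the tags it may take.
LEXICON = {
    "book": {"NOUN", "VERB"},
    "flight": {"NOUN"},
    "table": {"NOUN"},
    "to": {"PART"},
    "on": {"ADP"},
}


def tag(tokens):
    """Assign one tag per token using the lexicon and the previous word."""
    tags = []
    for i, token in enumerate(tokens):
        word = token.lower()
        prev = tokens[i - 1].lower() if i > 0 else None
        if word in DETERMINERS:
            tags.append("DET")
            continue
        candidates = LEXICON.get(word, {"NOUN"})  # unknown words default to NOUN
        if len(candidates) == 1:
            tags.append(next(iter(candidates)))
        elif prev in DETERMINERS and "NOUN" in candidates:
            tags.append("NOUN")  # "the book" -> noun
        elif prev == "to" and "VERB" in candidates:
            tags.append("VERB")  # "to book" -> verb
        else:
            tags.append(sorted(candidates)[0])  # arbitrary fallback
    return tags


print(tag("the book on the table".split()))
print(tag("to book a flight".split()))
```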
The goal of terminology extraction is to automatically extract relevant terms from a given corpus.
What is the computational meaning of individual words in context?
Automatically translate text from one human language to another. This is one of the most difficult problems, and is a member of a class of problems colloquially termed "AI-complete", i.e. requiring all of the different types of knowledge that humans possess (grammar, semantics, facts about the real world, etc.) to solve properly.
Given a stream of text, determine which items in the text map to proper names, such as people or places, and what the type of each such name is (e.g. person, location, organization).
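One simple way to approximate this is a gazetteer lookup, sketched below. The gazetteer entries and the function `find_entities` are hypothetical. A lookup like this cannot recognize names it has never seen, which is why practical systems typically learn from annotated data.

```python
import re

# Hypothetical gazetteer mapping known names to entity types.
GAZETTEER = {
    "Ada Lovelace": "PERSON",
    "London": "LOCATION",
    "Royal Society": "ORGANIZATION",
}


def find_entities(text):
    """Return (start_offset, name, type) tuples sorted by position in the text."""
    found = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            found.append((match.start(), name, label))
    return sorted(found)


for start, name, label in find_entities("Ada Lovelace visited the Royal Society in London."):
    print(start, name, label)
```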
Convert information from computer databases or semantic intents into readable human language.
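A template is perhaps the simplest form of this conversion. In the sketch below, the record fields (`city`, `condition`, `temp_c`) and the function `describe_weather` are invented for illustration and do not come from any particular database.

```python
def describe_weather(record: dict) -> str:
    """Render a hypothetical database row as an English sentence."""
    return (
        f"In {record['city']}, it is {record['condition']} "
        f"with a temperature of {record['temp_c']} degrees Celsius."
    )


row = {"city": "Oslo", "condition": "cloudy", "temp_c": 4}
print(describe_weather(row))
```

Templates produce rigid text; more flexible systems plan the content and wording separately.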
Given an image representing printed text, determine the corresponding text.
Given a human-language question, determine its answer. Typical questions have a specific right answer (such as "What is the capital of Canada?"), but sometimes open-ended questions are also considered (such as "What is the meaning of life?").
Given a chunk of text, identify the relationships among named entities (e.g. who is married to whom).
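The "who is married to whom" example can be sketched with a single surface pattern. The regular expressions and the function `extract_marriages` are assumptions for illustration; they only match capitalized names joined by the exact phrase "is married to".

```python
import re

# One or more capitalized words, e.g. "Alice Smith".
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
MARRIED = re.compile(rf"({NAME}) is married to ({NAME})")


def extract_marriages(text):
    """Return (person, relation, person) triples for each pattern match."""
    return [(m.group(1), "married_to", m.group(2)) for m in MARRIED.finditer(text)]


print(extract_marriages("Alice Smith is married to Bob Jones. Carol is married to Dave."))
```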
Extract subjective information, usually from a set of documents, often using online reviews to determine the "polarity" of opinion about specific objects. It is especially useful for identifying trends in public opinion on social media, for example for marketing purposes.
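A minimal lexicon-based sketch of polarity detection follows. The word lists and the function `polarity` are illustrative assumptions; counting words this way ignores negation, sarcasm, and context.

```python
import re

# Hypothetical polarity lexicons; real ones contain thousands of entries.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def polarity(text: str) -> str:
    """Classify text by the difference between positive and negative word counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(polarity("Great screen, but terrible battery and poor support."))
```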
Given a chunk of text, separate it into segments, each of which is devoted to a topic, and identify the topic of each segment.
Many words have more than one meaning; we have to select the meaning which makes the most sense in context. For this problem, we are typically given a list of words and associated word senses, e.g. from a dictionary or an online resource such as WordNet.
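One way to pick a sense from such a list is to count the words shared between each sense's definition and the context, in the spirit of the simplified Lesk approach. The sense inventory below is hand-written for illustration and is not taken from WordNet; `SENSES`, `content_words`, and `disambiguate` are hypothetical names.

```python
import re

# Hypothetical sense inventory: word -> {sense label: definition}.
SENSES = {
    "bank": {
        "financial institution": "an institution that accepts deposits and lends money",
        "river side": "the sloping land beside a river or lake",
    }
}
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "on", "in", "by", "we"}


def content_words(text: str) -> set:
    """Lowercased alphabetic tokens with stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def disambiguate(word: str, context: str) -> str:
    """Return the sense whose definition shares the most words with the context."""
    context_set = content_words(context)
    senses = SENSES[word]
    return max(senses, key=lambda s: len(content_words(senses[s]) & context_set))


print(disambiguate("bank", "She deposits money at the bank"))
print(disambiguate("bank", "We fished from the bank of the river"))
```

When no definition overlaps with the context, this sketch simply returns the first listed sense.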
Produce a readable summary of a chunk of text. Often used to provide summaries of text of a known type, such as research papers or articles in the financial section of a newspaper. There are two types of summarization: abstractive and extractive.
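As a rough illustration of selecting sentences from the source text, the sketch below scores each sentence by the average corpus frequency of its words and keeps the top-scoring ones in their original order. The scoring scheme and the function `summarize` are assumptions chosen for brevity, not a standard method.

```python
import re
from collections import Counter


def summarize(text: str, n_sentences: int = 1) -> list:
    """Return the n highest-scoring sentences, kept in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        # Average word frequency, so long sentences are not favored by length alone.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return [s for s in sentences if s in top]


print(summarize("Cats sleep. Cats eat fish. Dogs bark.", n_sentences=2))
```

No stopword filtering is applied here, so on real text very common function words would dominate the scores.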
Measure the similarity between two chunks of data.
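For text, one common measure is the cosine similarity between bag-of-words count vectors. The sketch below assumes whitespace-and-punctuation tokenization via a regular expression; the function name `cosine_similarity` is chosen here for illustration.

```python
import math
import re
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the word-count vectors of two texts."""
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    # Define similarity as 0.0 when either text has no tokens.
    return dot / norm if norm else 0.0


print(cosine_similarity("the cat sat on the mat", "the cat lay on the rug"))
```

The result ranges from 0.0 for texts with no words in common to about 1.0 for texts with identical word counts.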
Information extraction is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. In most cases, this activity concerns processing human-language texts by means of natural language processing (NLP).
Given a sound clip of a person or people speaking, determine the textual representation of the speech. In natural speech there are hardly any pauses between successive words, and thus speech segmentation is a necessary subtask of speech recognition (see below). Also, because words in the same language are spoken by people with different accents, speech recognition software must be able to recognize widely varying inputs as having the same textual equivalent.
Given a sound clip of a person or people speaking, separate it into words. A subtask of speech recognition and typically grouped with it.
Partition an input audio stream into homogeneous segments according to speaker identity.
Given a text, transform its units and produce a spoken representation. Text-to-speech can be used to aid the visually impaired.