# 1. Introduction to Multinomial Logistic Regression

Logistic regression has been used in biological research since the early twentieth century and was later adopted in many social sciences. It applies when the dependent variable (the target value) is categorical.

For example, we may need to predict:

- whether an email is spam (1) or not (0);
- whether a tumor is malignant (1) or benign (0).

**Multinomial logistic regression** is a statistical method used for classification problems where the outcome can take on more than two categories. It's an extension of binary logistic regression. The goal is to model the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables.

Let's look at the following examples:

Example Scenarios

- Classification of Texts: Determining the topic of the text (for example, sports, politics, technology, art).

- Medical Diagnosis: Classification of the type of disease (e.g., infectious, inflammatory, genetic, metabolic).

In each of these scenarios, the outcome takes one of several categories, and multinomial logistic regression can be used to predict the probability of each category.

##### Data Structure  
Consider a data set $\mathcal D$ in which each element is a pair: a feature vector $\boldsymbol x_i$ and a class label $y_i$:

$$
\mathcal{D} = \{(\boldsymbol{x}_i, y_i)\}_{i=1}^n, \quad \text{where } \boldsymbol{x}_i \in \mathbb{R}^d \text{ and } y_i \in \mathcal{Y}.
$$

Here $\mathcal Y = \{1, \ldots, K\}$ is the set of possible categories and $d$ is the number of features.
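
As a rough illustration of this structure, the sketch below builds a tiny data set in plain Python. The feature values, the category names, and the variable names (`CATEGORIES`, `dataset`, `X`, `y`, `n`, `d`, `K`) are all made up for this example; `d` denotes the number of features per example and `K` the number of categories. Note that the code indexes classes from 0, while the formulas index them from 1.

```python
# Toy data set D = {(x_i, y_i)}: hypothetical values for illustration only.
CATEGORIES = ["sports", "politics", "technology"]  # the label set Y

dataset = [
    ([0.9, 0.1], "sports"),
    ([0.2, 0.8], "politics"),
    ([0.5, 0.5], "technology"),
    ([0.8, 0.3], "sports"),
]

# Encode each label as an integer index 0..K-1.
label_to_index = {label: k for k, label in enumerate(CATEGORIES)}
X = [features for features, _ in dataset]          # feature vectors x_i
y = [label_to_index[label] for _, label in dataset]  # class labels y_i

n, d, K = len(X), len(X[0]), len(CATEGORIES)
print(n, d, K)  # 4 2 3
print(y)        # [0, 1, 2, 0]
```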

##### Probability Prediction
Multinomial logistic regression predicts the probability of membership in each of the $K$ categories. The predicted probability vector $\boldsymbol{\hat{y}}$ is defined as:

$$
\boldsymbol{\hat{y}} = (p_1, \ldots, p_K), \quad \text{where } p_k > 0 \text{ and } \sum_{k=1}^K p_k = 1.
$$


##### Calculating Logits and Softmax Transformation
The logits $\boldsymbol{z}$ are calculated as a linear combination of the input features $\boldsymbol{x}$ and the weights $\boldsymbol{w}_k$ for each class. The softmax function then converts them into the predicted probability vector $\boldsymbol{\hat{y}}$:

$$
z_k = \boldsymbol{x}^\top \boldsymbol{w}_k, \quad \boldsymbol{\hat{y}} = \text{Softmax}(\boldsymbol{z}) = \left( \frac{e^{z_1}}{\sum_{j=1}^K e^{z_j}}, \ldots , \frac{e^{z_K}}{\sum_{j=1}^K e^{z_j}} \right)
$$
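
The two steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names `logits`, `softmax`, and `sigmoid`, and all numeric values, are assumptions made for the example. Subtracting the largest logit before exponentiating is a common numerical-stability trick that leaves the result unchanged.

```python
import math

def logits(x, weights):
    """Compute z_k = x^T w_k for each class weight vector w_k."""
    return [sum(xi * wi for xi, wi in zip(x, w_k)) for w_k in weights]

def softmax(z):
    """Map a list of logits to positive probabilities that sum to 1."""
    z_max = max(z)  # shifting by the max avoids overflow in exp
    exps = [math.exp(v - z_max) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(t):
    """Logistic function used in binary logistic regression."""
    return 1.0 / (1.0 + math.exp(-t))

# Hypothetical example: 2 features, K = 3 classes.
x = [1.0, 2.0]
weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.1]]  # w_1, w_2, w_3

z = logits(x, weights)
y_hat = softmax(z)
print(z)
print(y_hat)

# With K = 2, softmax reduces to the sigmoid of the logit difference.
two_class = softmax([0.3, -1.2])
print(two_class[0], sigmoid(0.3 - (-1.2)))
```

The last two lines illustrate why the model is described as an extension of binary logistic regression: with only two classes, the first softmax probability coincides with the sigmoid applied to $z_1 - z_2$.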


##### Class Selection
The predicted class is the index $k$ with the highest predicted probability:

$$
\text{Predicted class} = \arg\max_{1 \leq k \leq K} p_k
$$
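
A minimal sketch of this selection rule in plain Python follows. The function name `predict_class` and the probability values are hypothetical. The function returns a 1-based index to match the formula, and resolving ties in favor of the smallest index is a choice made for this example, not something the formula prescribes.

```python
def predict_class(probabilities):
    """Return the 1-based index k that maximizes p_k (ties go to the smallest k)."""
    best = max(range(len(probabilities)), key=lambda k: probabilities[k])
    return best + 1

p = [0.2, 0.5, 0.3]  # hypothetical probability vector (p_1, p_2, p_3)
print(predict_class(p))  # 2
```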

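##### Model Parameters
The per-class weight vectors are collected as the columns of the weight matrix $\boldsymbol{W}$, so that all logits can be computed at once:

$$
\boldsymbol{W} = [\boldsymbol{w}_1 \ldots \boldsymbol{w}_K], \quad \boldsymbol{z} = \boldsymbol{W}^\top \boldsymbol{x}
$$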
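
To make the column layout concrete, the sketch below stores $\boldsymbol{W}$ as a list of rows and computes the logits for several examples in one matrix product. The helper `matmul`, the names `X`, `W`, `Z`, and all numeric values are assumptions for illustration; in practice a numerical library would typically handle the matrix product.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Hypothetical values: 2 features, K = 3 classes, 2 examples.
W = [[0.5, 0.1, -0.4],   # row j holds the weights of feature j for every class
     [-0.2, 0.3, 0.1]]   # column k is the weight vector w_k
X = [[1.0, 2.0],         # each row is one feature vector x_i
     [0.0, 1.0]]

Z = matmul(X, W)  # one row of K logits per example

# Column 2 of W recovers w_2, so the matching entry of Z equals x_1^T w_2.
w_2 = [row[1] for row in W]
z_12 = sum(xj * wj for xj, wj in zip(X[0], w_2))
print(Z)
print(z_12)
```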

### Quiz: Basic Concepts of Multinomial Logistic Regression


import ipywidgets as widgets
from IPython.display import display, clear_output

questions = [
    {
        "question": "1. What is Multinomial Logistic Regression primarily used for?",
        "options": ["Predicting continuous outcomes", "Classifying outcomes into one of two categories", "Estimating the relationships between continuous variables", "Classifying outcomes into one of three or more categories"],
        "answer": "Classifying outcomes into one of three or more categories"
    },
    {
        "question": "2. Which function does Multinomial Logistic Regression use to model probabilities?",
        "options": ["Linear function", "Sigmoid function", "Softmax function", "Tangent function"],
        "answer": "Softmax function"
    },
    {
        "question": "3. In Multinomial Logistic Regression, the outcome variable should be:",
        "options": ["Continuous", "Ordinal", "Nominal with two categories", "Nominal with more than two categories"],
        "answer": "Nominal with more than two categories"
    },
    {
        "question": "4. What is the main difference between Multinomial and Binary Logistic Regression?",
        "options": ["The number of predictor variables", "The number of outcome categories", "The type of algorithms used for optimization", "The type of data used for the model"],
        "answer": "The number of outcome categories"
    },
    {
        "question": "5. Which of the following is an assumption of Multinomial Logistic Regression?",
        "options": ["The outcome categories must have a natural order", "Observations should be independent of each other", "Residuals need to be normally distributed", "The model requires a large number of predictor variables"],
        "answer": "Observations should be independent of each other"
    },
    {
        "question": "6. What does the 'multinomial' in Multinomial Logistic Regression refer to?",
        "options": ["Multiple linear relationships", "Multiple predictor variables", "Multiple outcome categories", "Multiple regression analyses performed simultaneously"],
        "answer": "Multiple outcome categories"
    },
    {
        "question": "7. Which method is commonly used to estimate the model parameters in Multinomial Logistic Regression?",
        "options": ["Ordinary Least Squares", "Maximum Likelihood Estimation", "Bayesian Inference", "Ridge Regression"],
        "answer": "Maximum Likelihood Estimation"
    },
    {
        "question": "8. In the context of Multinomial Logistic Regression, what does 'overfitting' refer to?",
        "options": ["When the model is too simple to capture the patterns in the data", "When the model captures the noise along with the underlying pattern in the data", "When there are too few predictor variables in the model", "When the model is trained on a very large dataset"],
        "answer": "When the model captures the noise along with the underlying pattern in the data"
    },
    {
        "question": "9. Which of the following metrics can be used to evaluate the performance of a Multinomial Logistic Regression model?",
        "options": ["R-squared value", "Mean Squared Error", "Accuracy and Confusion Matrix", "P-value of the F-test"],
        "answer": "Accuracy and Confusion Matrix"
    },
    {
        "question": "10. Can Multinomial Logistic Regression be used for binary classification problems?",
        "options": ["Yes, but it is not recommended", "No, it can only be used for multiclass problems", "Yes, it is the same as binary logistic regression", "No, because the algorithms are fundamentally different"],
        # With two classes, the softmax model reduces to binary (sigmoid) logistic regression.
        "answer": "Yes, it is the same as binary logistic regression"
    }
]

# Function to create a quiz question widget
def create_quiz_question(question, options, answer):
    question_label = widgets.Label(value=question)
    options_widget = widgets.RadioButtons(options=options, description='Choices:')
    submit_button = widgets.Button(description="Submit")
    output = widgets.Output()

    def check_answer(b):
        with output:
            clear_output()
            if options_widget.value == answer:
                print("Correct!")
            else:
                print("Incorrect. The correct answer is:", answer)

    submit_button.on_click(check_answer)
    display(question_label, options_widget, submit_button, output)

# Display each question
for q in questions:
    create_quiz_question(q["question"], q["options"], q["answer"])

