# __Machine Learning - Logistic Regression.__
Date : 29, March, 2024.

- Logistic regression solves classification problems by predicting categorical outcomes, unlike linear regression, which predicts a continuous outcome.

- In the simplest case there are two outcomes, which is called binomial. An example is predicting whether a tumor is malignant or benign.

- Other cases have more than two outcomes to classify, which is called multinomial. A common example of multinomial logistic regression is predicting the class of an iris flower among 3 different species.
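
For comparison with the binomial case used in the rest of these notes, the iris example can be sketched as follows. This is a minimal illustration, assuming scikit-learn's bundled iris dataset (`datasets.load_iris`). The `max_iter=1000` setting and the names `iris_model`, `X_iris` and `y_iris` are choices made here for illustration, not part of the original notes.

```python
from sklearn import datasets, linear_model

# Load the bundled iris dataset: 150 flowers, 4 features, 3 species
iris = datasets.load_iris()
X_iris, y_iris = iris.data, iris.target

# max_iter is raised above the default (an assumption, to give the solver room to converge)
iris_model = linear_model.LogisticRegression(max_iter=1000)
iris_model.fit(X_iris, y_iris)

# One class label per species
print(iris_model.classes_)

# One probability per class for the first flower; each row sums to 1
first_probs = iris_model.predict_proba(X_iris[:1])
print(first_probs)
```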

Here we will be using basic logistic regression to predict a binomial variable. This means it has only two possible outcomes.

- __Example__ : Is the tumor cancerous or not?

In [7]:
import numpy 
from sklearn import linear_model 

X= numpy.array([3.75, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1, 1)
y= numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

logr= linear_model.LogisticRegression()
logr.fit(X, y)

# Predict whether a tumor is cancerous, e.g. for a size of 3.46 cm:
size = float(input("Enter size of tumor (cm): "))  # float() is safer than eval() on user input
print(size)
predict = logr.predict(numpy.array([size]).reshape(-1, 1))
if predict[0] == 0:
    print("No, the tumor is not cancerous!")
else:
    print("Unfortunately, the tumor is cancerous!")

3.46
No, the tumor is not cancerous!


__Code Explanation :__ 

- X: This array contains the sizes of tumors. It's reshaped to a column vector using .reshape(-1, 1).    
- y: Represents whether or not the tumor is cancerous (0 for "No", 1 for "Yes").

- From sklearn's linear_model module we use the LogisticRegression class to create a logistic regression object.
- This object has a method called __fit()__ that takes the independent and dependent values as parameters and fills the regression object with data that describes the relationship.
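
What __fit()__ stores can be inspected directly. The sketch below refits the same data and prints the learned attributes. The `boundary` variable is a name introduced here for illustration: it is the tumor size at which the log-odds are zero, so the predicted class switches from 0 to 1 there.

```python
import numpy
from sklearn import linear_model

X = numpy.array([3.75, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1, 1)
y = numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

logr = linear_model.LogisticRegression()
logr.fit(X, y)

# Attributes ending in an underscore are filled in by fit()
print("classes:", logr.classes_)      # the labels seen in y
print("coefficient:", logr.coef_)     # shape (1, 1): one class boundary, one feature
print("intercept:", logr.intercept_)  # shape (1,)

# Size (in cm) where coef * size + intercept = 0, i.e. probability 0.5
boundary = -logr.intercept_[0] / logr.coef_[0, 0]
print("decision boundary (cm):", boundary)
```

Sizes well below the boundary are predicted as class 0 and sizes well above it as class 1. A size very close to the boundary, such as 3.46, can fall on either side depending on small numerical differences in the fit.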

__Coefficient :__

- In logistic regression, the coefficient is the expected change in the log-odds of the outcome (such as "cancerous tumor") for every one-unit increase in the predictor variable X. Exponentiating it gives the factor by which the odds are multiplied.

- Example : By what factor do the odds of a tumor being cancerous change for every 1 cm increase in size?

In [8]:
import numpy 
from sklearn import linear_model 

X= numpy.array([3.75, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1, 1)
y= numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

logr= linear_model.LogisticRegression()
logr.fit(X, y)

log_odd= logr.coef_
odds= numpy.exp(log_odd)      # exponentiate the log-odds coefficient to get the odds ratio

print(odds)


[[4.06336674]]


__NOTE :__ This tells us that for every additional centimeter in tumor size, the odds of the tumor being cancerous are multiplied by about 4 (4.06) compared to the previous odds.

- "__Likelihood__" is used informally here to mean how probable an outcome is given certain conditions or data. The odds are the ratio p / (1 - p), not the probability p itself.
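
The odds interpretation can be checked numerically. The sketch below is an illustration only: the helper `odds_at` and the sizes 2.0 and 3.0 are chosen here and are not part of the original notes. It computes the odds at two sizes one unit apart and compares their ratio with the exponentiated coefficient.

```python
import numpy
from sklearn import linear_model

X = numpy.array([3.75, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1, 1)
y = numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

logr = linear_model.LogisticRegression()
logr.fit(X, y)

def odds_at(size):
    # predict_proba returns [P(class 0), P(class 1)] for each row
    p = logr.predict_proba(numpy.array([[size]]))[0, 1]
    return p / (1 - p)

# Ratio of the odds at two sizes that are one unit apart
ratio = odds_at(3.0) / odds_at(2.0)
odds_ratio = numpy.exp(logr.coef_[0, 0])

print(ratio, odds_ratio)  # the two values should agree up to floating-point error
```

The ratio is the same for any pair of sizes one unit apart, which is why a single number can summarize the effect of size on the odds.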

__Probability.__

- The coefficient and intercept values can be used to find the probability that each tumor is cancerous.

- Create a function that uses the model's coefficient and intercept values to return a new value. This new value represents the probability that the given observation is a cancerous tumor:

- Code : 

In [5]:
# Importing necessary libraries
import numpy
from sklearn import linear_model

# Sample tumor sizes (in centimeters) as predictor variables
tumor_sizes = numpy.array([3.75, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69, 5.88])

# Reshaping the tumor sizes array to fit logistic regression format
# Each tumor size is represented as a single feature (reshaped as a column vector)
X = tumor_sizes.reshape(-1, 1)

# Binary labels indicating whether the tumor is cancerous (1) or not (0)
y = numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

# Creating a logistic regression model
logistic_regression_model = linear_model.LogisticRegression()

# Fitting the model to the data
logistic_regression_model.fit(X, y)

# Defining a function to convert logit to probability
def logit_to_probability(logr, X):
    # Calculating log-odds using the logistic regression coefficients and intercept
    log_odds = logr.coef_ * X + logr.intercept_
    
    # Calculating odds from log-odds using the exponential function
    odds = numpy.exp(log_odds)
    
    # Calculating probability from odds using the logistic function
    probability = odds / (1 + odds)
    
    return probability

# Printing the probabilities of tumor being cancerous
print("Probabilities of tumor being cancerous:")
print(logit_to_probability(logistic_regression_model, X))


Probabilities of tumor being cancerous:
[[0.59934789]
 [0.19249379]
 [0.12735048]
 [0.0093915 ]
 [0.07992729]
 [0.07300121]
 [0.88524882]
 [0.78108425]
 [0.89082375]
 [0.81491941]
 [0.57898985]
 [0.96736006]]



__Results Explained:__

Each value in the output (0.59934789, ...) is the probability that the tumor of the corresponding size is cancerous:

- The probability that a tumor with the size 3.75cm is cancerous is approximately 60%.
- The probability that a tumor with the size 2.44cm is cancerous is approximately 19%.
- The probability that a tumor with the size 2.09cm is cancerous is approximately 13%.
- The probability that a tumor with the size 0.14cm is cancerous is approximately 1%.
- The probability that a tumor with the size 1.72cm is cancerous is approximately 8%.
- The probability that a tumor with the size 1.65cm is cancerous is approximately 7%.
- The probability that a tumor with the size 4.92cm is cancerous is approximately 89%.
- The probability that a tumor with the size 4.37cm is cancerous is approximately 78%.
- The probability that a tumor with the size 4.96cm is cancerous is approximately 89%.
- The probability that a tumor with the size 4.52cm is cancerous is approximately 81%.
- The probability that a tumor with the size 3.69cm is cancerous is approximately 58%.
- The probability that a tumor with the size 5.88cm is cancerous is approximately 97%.
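
As a cross-check, the manually computed probabilities can be compared with scikit-learn's own `predict_proba` method, whose second column holds the probability of class 1 (cancerous). This is a minimal sketch that refits the same data. The names `manual` and `builtin` are introduced here for illustration.

```python
import numpy
from sklearn import linear_model

X = numpy.array([3.75, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1, 1)
y = numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

logr = linear_model.LogisticRegression()
logr.fit(X, y)

# Manual route: log-odds -> probability via the logistic (sigmoid) function
log_odds = logr.coef_ * X + logr.intercept_
manual = 1 / (1 + numpy.exp(-log_odds))

# Built-in route: column 1 is P(class 1) for each tumor
builtin = logr.predict_proba(X)[:, 1]

# Compare the two routes element by element
print(numpy.allclose(manual.ravel(), builtin))
```

Note that `1 / (1 + exp(-z))` is algebraically the same as `exp(z) / (1 + exp(z))`, the form used in the function above.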