## Web Application

### Prepare the data
1. Read the data from the CSV spreadsheet

In [5]:
# Import library and read data
import pandas as pd 
import numpy as np 

ufos = pd.read_csv('./data/ufos.csv')
ufos.head()

Unnamed: 0,datetime,city,state,country,shape,duration (seconds),duration (hours/min),comments,date posted,latitude,longitude
0,10/10/1949 20:30,san marcos,tx,us,cylinder,2700.0,45 minutes,This event took place in early fall around 194...,4/27/2004,29.883056,-97.941111
1,10/10/1949 21:00,lackland afb,tx,,light,7200.0,1-2 hrs,1949 Lackland AFB&#44 TX. Lights racing acros...,12/16/2005,29.38421,-98.581082
2,10/10/1955 17:00,chester (uk/england),,gb,circle,20.0,20 seconds,Green/Orange circular disc over Chester&#44 En...,1/21/2008,53.2,-2.916667
3,10/10/1956 21:00,edna,tx,us,circle,20.0,1/2 hour,My older brother and twin sister were leaving ...,1/17/2004,28.978333,-96.645833
4,10/10/1960 20:00,kaneohe,hi,us,light,900.0,15 minutes,AS a Marine 1st Lt. flying an FJ4B fighter/att...,1/22/2004,21.418056,-157.803611


2. Create a new dataframe with only the columns you need

In [6]:
# Create dataframe based on data
ufos = pd.DataFrame({
    'Seconds': ufos['duration (seconds)'],
    'Country': ufos['country'],
    'Latitude': ufos['latitude'],
    'Longitude': ufos['longitude']
})
ufos.Country.unique()

# Drop any null values
ufos.dropna(inplace=True)
# Keep only sightings that lasted between 1 and 60 seconds
ufos = ufos[(ufos['Seconds'] >= 1) & (ufos['Seconds'] <= 60)]

3. Convert text values to numbers

In [7]:
# Encode data alphabetically
from sklearn.preprocessing import LabelEncoder

ufos['Country'] = LabelEncoder().fit_transform(ufos['Country'])
print(ufos)

       Seconds  Country   Latitude   Longitude
2         20.0        3  53.200000   -2.916667
3         20.0        4  28.978333  -96.645833
14        30.0        4  35.823889  -80.253611
23        60.0        4  45.582778 -122.352222
24         3.0        3  51.783333   -0.783333
...        ...      ...        ...         ...
80320     60.0        4  33.209722  -87.569167
80321      3.0        4  36.529722  -87.359444
80323     60.0        4  29.651389  -82.325000
80326     20.0        4  34.101389  -84.519444
80330      5.0        4  38.901111  -77.265556

[25863 rows x 4 columns]
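
`LabelEncoder` numbers the distinct labels in sorted order, which is why `gb` became 3 and `us` became 4 in the output above. The plain-Python sketch below mimics that mapping without scikit-learn so the numbering is easy to inspect. The sample list is hypothetical: only `gb` and `us` are confirmed by the data shown here, and the other codes are assumed for illustration.

```python
# Hypothetical sample of country codes; only 'gb' and 'us' are
# confirmed by the notebook output, the rest are assumptions.
countries = ['us', 'gb', 'us', 'ca', 'au', 'de', 'us']

# Sort the distinct labels and number them in that order.
classes = sorted(set(countries))
code_of = {label: index for index, label in enumerate(classes)}
encoded = [code_of[label] for label in countries]

print(code_of)     # {'au': 0, 'ca': 1, 'de': 2, 'gb': 3, 'us': 4}
print(encoded)     # [4, 3, 4, 1, 0, 2, 4]

# Decode a class index back to its label.
print(classes[3])  # gb
```

With the real encoder, keeping the fitted `LabelEncoder` object in a variable would allow the same decoding through its `inverse_transform` method.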


### Build your model
Split the data into training and testing sets, then train a model on the training set.

4. Select the features to train on as the X vector, with the country as the Y target

In [8]:
# Use sklearn function to divide the data
from sklearn.model_selection import train_test_split

selected_features = ['Seconds', 'Latitude', 'Longitude']
X = ufos[selected_features]
Y = ufos['Country']

X_train, X_test, Y_train, Y_test  = train_test_split(X, Y, test_size=0.2, random_state=0)

5. Train the model using logistic regression

In [9]:
# Train the model
from sklearn.metrics import accuracy_score, classification_report
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(X_train, Y_train)
predictions = model.predict(X_test)

print(classification_report(Y_test, predictions))
print('Accuracy: ', accuracy_score(Y_test, predictions))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        41
           1       0.91      0.19      0.32       250
           2       1.00      0.88      0.93         8
           3       0.99      1.00      1.00       131
           4       0.96      1.00      0.98      4743

    accuracy                           0.96      5173
   macro avg       0.97      0.81      0.84      5173
weighted avg       0.96      0.96      0.95      5173

Accuracy:  0.9597912236613184


STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
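
The warning above suggests two remedies: raise `max_iter` or scale the features. A minimal sketch of both, assuming a pipeline that standardizes the inputs before fitting, might look like the following. The data here is synthetic and the names `X_demo`, `y_demo`, and `scaled_model` are placeholders, not part of the notebook. The value `max_iter=1000` is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the (Seconds, Latitude, Longitude) features.
rng = np.random.default_rng(0)
X_demo = np.column_stack([
    rng.uniform(1, 60, 200),      # seconds
    rng.uniform(-90, 90, 200),    # latitude
    rng.uniform(-180, 180, 200),  # longitude
])
# Synthetic label: 1 when the longitude is negative, otherwise 0.
y_demo = (X_demo[:, 2] < 0).astype(int)

# Standardize each feature, then fit logistic regression with a
# higher iteration limit than the default.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X_demo, y_demo)

train_accuracy = scaled_model.score(X_demo, y_demo)
print(train_accuracy)
```

Because the scaler is part of the pipeline, the same transformation is applied automatically at prediction time, so raw feature values can still be passed to `predict`.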


### Get the predictions
Use the `pickle` library to save the trained model, then reload it and make predictions based on input values.

6. Save the model and make predictions

In [10]:
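# Save the trained model to disk
import pickle
model_filename = 'ufo-model.pkl'
with open(model_filename, 'wb') as f:
    pickle.dump(model, f)

# Reload the model and predict for [Seconds, Latitude, Longitude]
with open(model_filename, 'rb') as f:
    model = pickle.load(f)
print(model.predict([[50, 44, -12]]))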

[1]




### Build a Flask App
Build a Flask app to call your model and return the results.
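
One way such an app could be structured is sketched below. It is an illustration under assumptions, not the notebook's own implementation: the `/predict` route, the JSON field names, the `create_app` factory, and the `StubModel` class are all hypothetical. The stub stands in for the pickled model so the example runs without `ufo-model.pkl`.

```python
from flask import Flask, jsonify, request


def create_app(model):
    """Build a Flask app around any object with a scikit-learn style predict()."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        data = request.get_json()
        # Feature order matches training: Seconds, Latitude, Longitude.
        features = [[
            float(data["seconds"]),
            float(data["latitude"]),
            float(data["longitude"]),
        ]]
        prediction = model.predict(features)
        return jsonify({"country_index": int(prediction[0])})

    return app


class StubModel:
    """Hypothetical placeholder that always predicts class 1."""

    def predict(self, rows):
        return [1 for _ in rows]


# Exercise the route with Flask's built-in test client.
app = create_app(StubModel())
client = app.test_client()
response = client.post(
    "/predict", json={"seconds": 50, "latitude": 44, "longitude": -12}
)
print(response.get_json())  # {'country_index': 1}
```

In a real app, `StubModel()` would be replaced by the model loaded from `ufo-model.pkl` with `pickle.load`, and the returned index would be mapped back to a country code before it is shown to the user.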