# Lab 01: Algorithm Design and Analysis
ISC 4221<br>
Due September 11th, 2019<br>
Connor Poetzinger<br> 

## Introduction

This lab introduces brute force algorithms. I implemented two sorting algorithms, selection sort and bubble sort, to sort a real one-dimensional array in ascending order. I then apply the selection sorting algorithm to a problem to determine the closest store location from a hypothetical caller's latitude and longitude location. The stores distance are calculated via the Haversine formula and the resulting list of stores, city, and distance are then sorted by distance in ascending order.

## 1. Selection Sort

This algorithm takes in an array of *n* real numbers to be sorted and outputs the sorted array in ascending order as well as its positional index vector. The algorithm divides the input list into two parts: the sublist of items already sorted, which is  built up from left to right at the front of the list, and the sublist of items remaining to be sorted that occupy the rest of the list. The algorithm proceeds by finding the smallest element in the unsorted sublist, swapping it with the leftmost unsorted element, and moving the sublist boundaries one element to the right.

In [1]:
#Import modules 
import numpy as np
"""
Input: Numpy array of random real numbers from 0 to 100
Output: Sorted array and index vector

Goal: Take current element and swap it with the smallest element on its right 
"""

def selectionSort(A):
    #use numpy argsort to return indicies that would sort the array 
    indx = np.argsort(A)
    #Traverse through numpy array 
    for i in range(len(A)):
        #initial minimum location 
        min_loc = i
        #find location of smallest element on right 
        for j in range(i + 1, len(A)):
            #Check if number to the right is smaller then minimum location
            if A[j] < A[min_loc]:
                min_loc = j
        #Within the first for loop swap the minimum location with first element 
        A[min_loc], A[i] = A[i], A[min_loc]
    
    return A, indx

In [5]:
A = np.random.rand(26) * 100
print("Unsorted list or random floats from 0-100\n\n", A)

Unsorted list or random floats from 0-100

 [85.32501348 79.01238477 90.50273441 77.57919042 46.11591656 24.64842269
 48.83440491 87.63569242 40.53847143 94.45146099 21.85870201 63.3207343
  3.77130625 60.58723145 73.42841579 32.12889231  3.50014459 86.50772699
 48.76288708 14.55842927 28.95660332 45.01389282  4.52603508 60.82367533
 83.29754109 16.41267191]


In [6]:
B, indxd = selectionSort(A)
print("Sorted list of random floats from 0-100 and its position indicies\n\n", 
      B, "\n\n", indxd)

Sorted list of random floats from 0-100 and its position indicies

 [ 3.50014459  3.77130625  4.52603508 14.55842927 16.41267191 21.85870201
 24.64842269 28.95660332 32.12889231 40.53847143 45.01389282 46.11591656
 48.76288708 48.83440491 60.58723145 60.82367533 63.3207343  73.42841579
 77.57919042 79.01238477 83.29754109 85.32501348 86.50772699 87.63569242
 90.50273441 94.45146099] 

 [16 12 22 19 25 10  5 20 15  8 21  4 18  6 13 23 11 14  3  1 24  0 17  7
  2  9]


Selection sort is noted for its simplicity, it has O(n<sup>2</sup>) time complexity.

## 2. Bubble Sort

This algorithm takes in an array of *n* real numbers to be sorted and outputs the the sorted array in ascending order as well as its positional index vector. The algorithm proceeds by repeatedly sweeping through the list, compares adjacent elements and swaps them if they are in the wrong order. The sweep through the list is repeated until the list is sorted. 

In [4]:
#Import modules 
import numpy as np
"""
Input: Numpy array of random real numbers from 0 to 100
Output: Sorted array and index vector

Goal: Move left to right, compare consedcutive elements and switch them if 
they are out of order. Continue until no swaps are made through an entire
sweep.
"""

def bubbleSort(A):
    #use numpy argsort to return indicies that would sort the array 
    indx = np.argsort(A)
    #Traverse the numpy array
    for i in range(len(A)):
        #At each sweep compare the current j with the next value 
        #Use length minus 1 since we are comparing the current value 
        #with the next 
        for j in range((len(A) - 1) - i):
            #Swap positions if element found is greater than the next 
            #element. Largest nums bubble to the back 
            if A[j] > A[j + 1]:
                #Current element moves to the back 
                A[j], A[j + 1] = A[j + 1], A[j]
    
    return A, indx

In [5]:
A = np.random.rand(25) * 100
print("Unsorted list or random floats from 0-100\n\n", A)

Unsorted list or random floats from 0-100

 [72.56893187 76.16464016 29.82701629 25.11089047 88.5028215  43.43356996
 49.1445192  70.33294691 38.50901398 49.55203914 34.43311064  1.69675518
 96.73588528 64.46356414 58.62491947 88.58874741 82.18190962 90.73361751
 50.07136809 65.86577621 87.28108278 31.49009725 55.43346748 52.52842143
 51.7450819 ]


In [6]:
B, indxd = bubbleSort(A)
print("Sorted list of random floats from 0-100 and its position indicies\n\n",
      B, "\n\n", indxd)

Sorted list of random floats from 0-100 and its position indicies

 [ 1.69675518 25.11089047 29.82701629 31.49009725 34.43311064 38.50901398
 43.43356996 49.1445192  49.55203914 50.07136809 51.7450819  52.52842143
 55.43346748 58.62491947 64.46356414 65.86577621 70.33294691 72.56893187
 76.16464016 82.18190962 87.28108278 88.5028215  88.58874741 90.73361751
 96.73588528] 

 [11  3  2 21 10  8  5  6  9 18 24 23 22 14 13 19  7  0  1 16 20  4 15 17
 12]


Bubble sort is noted for its simplicity however, this algorithm is noticeably slower than selection sort. Bubble sort has a complexity of O(n<sup>2</sup>) which is the same as selection sort, but since bubble sort takes multiple sweeps of the list to sort the array, it is considered inferior to selection sort.

## 3. A Basic Application of Sorting

Using the Haversine algorithm, find the distance between the store distance and the user inputed longitude and latitude. The example latitude and longitude I use is 82 W 29 N, approximately near Ocala, Florida.

In [5]:
import numpy as np 
import pandas as pd 

def haversin(data):
    
    """
    This function reads in user latitude and logitude to simulate logging customer
    call locations. The user's long and lat are then converted to radians along 
    with other pre-defined longs and lats imported from a text file and save to a 
    pandas dataframe. The function then computes the distance from the caller to 
    the the stores in the dataframe using the haversine formula. The distances are
    then sorted in ascending order and printed out to provide the customer with 
    a list of closest stores, and how many miles to the store. 
    """
    
    #Read in user long and lat 
    #must transform string values to float 
    lon1 = float(input("Enter longitude: "))
    lat1 = float(input("Enter latitude: "))
    
    #Radius of earth from the equator in miles (found on google)
    R = 3963.0
    
    #assign user lat and lon and datatable lat and long to variables
    #I use the in-built function map to assign the numpy function 
    #np.radians to the degree
    #lats and long to transform degree into radians 
    lat1, lon1, lat2, lon2 = map(np.radians, 
                                 [lat1, lon1, data.latitude, data.longitude])
    
    #calculate the distance between the lats and longs
    lon_dist = lon2 - lon1
    lat_dist = lat2 - lat1
    
    #apply haversine formula 
    #np cos and sin provide for faster calculations 
    c = 2 * R * np.arcsin(np.sqrt((np.sin(lat_dist)/2)**2 + 
                                  np.cos(lat1) * np.cos(lat2) * 
                                  np.sin((lon_dist)/2)**2 ))
    
    #prompt user for the info provided 
    print("\nBelow are the closest stores from your location in ascending order\n")
    
    #apply selection sort
    sorted_result, indx = selectionSort(c)
    
    return sorted_result, indx

In [6]:
#import data table 
#use pandas to assign columns using strings 
data = pd.read_table('stores_location.dat', delim_whitespace=True, 
                     names = ('store', 'city', 'latitude', 'N', 'longitude','W'))

#call function 
sorted_dist , indx = haversin(data)

#Sort the cities and stores based on the index from selectionSort(c)
sorted_cities = [data['city'][indx[i]] for i in range(len(data))]
sorted_stores = [data['store'][indx[i]] for i in range(len(data))]

#create a tuple to package store num, city, and dist together
sorted_zip = list(zip(sorted_stores, sorted_cities, sorted_dist))

#Create output dataframe
final_sort = pd.DataFrame(sorted_zip, columns=['Store','City','Distance(m)'])
print(final_sort.to_string(index=False))

Enter longitude: 82
Enter latitude: 29

Below are the closest stores from your location in ascending order

   Store          City  Distance(m)
 store#2   Gainesville    49.170962
 store#6       Orlando    58.666064
 store#5         Tampa    77.023368
 store#4  Jacksonville    94.676263
 store#1   Tallahassee   168.646953
 store#7       Hialeah   241.000734
 store#3         Miami   248.602665


Testing the longitude and latitude coordinates for Ocala, Florida (82 W, 29 N), the resulting list is accurate. The closest city, Gainesville, is 49.17 miles away followed by Orlando at 58.66 miles away. The furthest city from the given cooridinates is Miami at is 248.60 miles away.