# Final Exam

## Question 1 
Define a function `insert_block` to insert a block of values in a matrix,

```julia
A = fill(0, 9, 9)
insert_block(A, 3, 5, 2)
```

so that the above code would yield:

`9×9 Array{Int64,2}:
 0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0
 0  0  0  0  2  2  2  0  0
 0  0  0  0  2  2  2  0  0
 0  0  0  0  2  2  2  0  0
 0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0`

In [1]:
function insert_block(A,row,col,n)
    if col+row-1 > size(A,2) || (2*row)-1 > size(A,1)
        print("Dimensions don't agree, A is a $(size(A,1))x$(size(A,2)) matrix.\nTrying to insert $(row)x$(row) matrix from row $(row), column $(col)")
        return
    end
    
    b = fill(n,row,row)
    A[row:(2*row)-1,col:col+row-1] = b
    return A
    
end

A = fill(0,9,9)
insert_block(A,3,5,2)

9×9 Array{Int64,2}:
 0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0
 0  0  0  0  2  2  2  0  0
 0  0  0  0  2  2  2  0  0
 0  0  0  0  2  2  2  0  0
 0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0
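
The solution above reuses `row` as both the starting row and the side length of the block. As a minimal sketch of a variant that makes the block size explicit, the code below takes the size as a separate argument. The name `insert_block_at!` and the extra argument `k` are assumptions made for illustration and are not part of the signature the exam asks for. Instead of a hand-written size test, it relies on `checkbounds` to reject a block that does not fit.

```julia
# Hypothetical variant: k is the side length of the square block
function insert_block_at!(A, row, col, k, n)
    rows = row:row+k-1
    cols = col:col+k-1
    checkbounds(A, rows, cols)   # throws a BoundsError if the block does not fit
    A[rows, cols] .= n           # broadcast assignment fills the block in place
    return A
end

A = fill(0, 9, 9)
insert_block_at!(A, 3, 5, 3, 2)
```

The trailing `!` follows the Julia convention for functions that modify their argument.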

## Question 2

### Create a Caesar cipher

A Caesar cipher is an encryption scheme that shifts every letter of the alphabet by a fixed offset to another letter of the alphabet.

For example, a shift of 1 would turn the letter "A" into the letter "B" and the letter "M" to the letter "N".

### Goal

We want to add a method to the `+` operator such that we can add together a string and an integer shift to encrypt a message. For example,

```julia
4 + "hello" == "lipps"
```

### Test it out

Once you think you have it working, try to decrypt the following string by adding a shift of -7.
```julia
"Kv'uv{'tlkksl'pu'{ol'hmmhpyz'vm'kyhnvuz'mvy'\u80v|'hyl'jy|ujo\u80'huk'{hz{l'nvvk'~p{o'rl{jo|w5"
```

## Step-by-step guide to solving the Caesar cipher problem

### Step 1

First, we want a way to convert between characters and integers. Under the hood, every character is represented as a number: its Unicode code point, which coincides with the *ASCII representation* for the first 128 characters.

You can start to get a feel for how this works by running the following lines of code.

```julia
convert(Int, 'a')
convert(Int, 'b')
convert(Char, 97)
convert(Char, 98)
```

In [2]:
convert(Int, 'a')
convert(Int, 'b')
convert(Char, 97)
convert(Char, 98)

'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)

What happens when you try to add an integer to a character? (Note that the difference between `Char`s and `String`s is important here!)

In [3]:
'a'+1

'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)
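
The note about `Char` versus `String` can be made concrete with a short sketch. It assumes a fresh session in which no extra `+` methods have been defined yet; the variable names are only illustrative.

```julia
# Char arithmetic works on the underlying code point
next_char = 'a' + 1        # 'b'
distance  = 'd' - 'a'      # 3 (subtracting two Chars gives an Int)

# A String is not a Char: without an extra method, String + Int is undefined
outcome = try
    "a" + 1
catch err
    err
end
outcome isa MethodError    # true in a fresh session
```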

When we treat a string elementwise, what is the type of (`typeof`) each element?

In [4]:
myString = "Hello"
for c in myString
    println(c,"-",typeof(c))
end

H-Char
e-Char
l-Char
l-Char
o-Char
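
Indexing a string with `1:length(s)` only works when every character occupies one byte. The encrypted message in this question contains `'\u80'`, which does not, so the difference matters here. A small sketch, using a made-up three-character string:

```julia
s = "a\u80b"                 # '\u80' needs two bytes in UTF-8

length(s)                    # 3 characters
ncodeunits(s)                # 4 bytes
collect(eachindex(s))        # [1, 2, 4]: index 3 is not the start of a character

# Iterating yields each Char regardless of how many bytes it occupies
chars = [c for c in s]
```

Iterating over the string directly (`for c in s`) avoids the problem without any index arithmetic.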


### Step 2
Try to write a function called `caesar(shift, stringin)` that encodes its input string, `stringin`, by shifting all letters in the alphabet by `shift`.

One way to do this is to use the `map` or `broadcast` function!

In [5]:
function caesar(shift, stringin)
    str = Char[]
    for c in stringin      # iterate over characters, not byte indices
        push!(str, shift + c)
    end
    return join(str)
end

caesar (generic function with 1 method)
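
Step 2 suggests `map` or `broadcast` as an alternative to the loop above. A minimal sketch of both follows; the names `caesar_map` and `caesar_broadcast` are hypothetical, chosen only to avoid clashing with `caesar`.

```julia
# map over a String applies the function to each Char and returns a String
caesar_map(shift, stringin) = map(c -> c + shift, stringin)

# Broadcasting needs a collection of Chars, so collect first and join afterwards
caesar_broadcast(shift, stringin) = join(collect(stringin) .+ shift)

caesar_map(4, "hello")           # "lipps"
caesar_broadcast(-4, "lipps")    # "hello"
```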

If you think you have this working, try out
```julia
caesar(-4, "lipps")
```

In [6]:
caesar(-4, "lipps")

"hello"

### Step 3
We want to extend the `+` operator to include a way to apply this cipher.

The `+` operator lives in a place called "Base". Everything that lives in Base is accessible to us as users by default, but we need a special incantation to modify the things that live in Base. If we want to modify `+`, our incantation is

```julia
import Base: +
```
To perform string addition, write a method for `+` like this

+(x::String, y::String) = string(x, y)

In [7]:
import Base: +
+(x::String, y::String) = string(x, y)

+ (generic function with 185 methods)

And now that you've extended `+` once, let's add another method for `+` that calls the `caesar` function we've written.

In [8]:
function caesar(shift, stringin)
    str = ""
    for c in stringin                  # iterate by character; safe for multi-byte characters
        str = str + string(shift + c)  # String + String concatenates via the method defined above
    end
    return str
end

caesar (generic function with 1 method)
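
The cell above redefines `caesar`; the goal stated at the start of the question additionally needs a `+` method that accepts an integer and a string. A minimal sketch of that remaining step is shown here, under the assumption that the shift is always the left operand, as in `4 + "hello"`. The one-line `caesar` is a compact restatement so that the sketch runs on its own.

```julia
import Base: +

# Compact restatement of caesar so that this sketch is self-contained
caesar(shift, stringin) = join(shift + c for c in stringin)

# Integer + String applies the cipher
+(shift::Integer, stringin::String) = caesar(shift, stringin)

4 + "hello" == "lipps"     # true
-4 + "lipps"               # "hello"
```

Adding methods to a `Base` function for types you do not own is usually discouraged outside of exercises like this one, since it changes behavior for all code in the session.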

Test your final answer.

In [9]:
caesar(-4, "lipps")

"hello"

In [10]:
stringin="Kv'uv{'tlkksl'pu'{ol'hmmhpyz'vm'kyhnvuz'mvy'\u80v|'hyl'jy|ujo\u80'huk'{hz{l'nvvk'~p{o'rl{jo|w5"
caesar(-7, stringin)

"Do not meddle in the affairs of dragons for you are crunchy and taste good with ketchup."

## Question 3

Download the `test.csv` file from Moodle. This data contains details of train passengers. Read the data and tidy it (handle the missing values). Calculate the average age of male passengers and of female passengers.

In [13]:
using CSV
using DataFrames
using Statistics

# CSV.read requires a sink type (here DataFrame) as its second argument
PassengersData = CSV.read("test.csv", DataFrame);
df = DataFrame(Sex = PassengersData.Sex, Age = PassengersData.Age)
PassengersData_clean = dropmissing(df);

maleData = PassengersData_clean[PassengersData_clean.Sex .== "male", :]
println("Average male age is ",mean(maleData.Age)," years")

femaleData = PassengersData_clean[PassengersData_clean.Sex .== "female", :]
println("Average female age is ",mean(femaleData.Age)," years")
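
Independently of the CSV file, the missing-value handling and per-group averaging asked for in this question can be sketched with the standard library alone. The records below are made up for illustration and are not taken from `test.csv`; `mean_age` is a hypothetical helper name.

```julia
using Statistics

# Made-up illustrative records, not taken from test.csv
sex = ["male", "female", "female", "male", "female"]
age = [34.0, missing, 22.0, 40.0, 30.0]

# Select one group, skip missing ages, then average
mean_age(group) = mean(skipmissing(age[sex .== group]))

mean_age("male")      # 37.0
mean_age("female")    # 26.0
```

`skipmissing` ignores the missing entries without copying the data, which plays the same role here as `dropmissing` does for a data frame.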

## Question 4   

Download the RestaurantData.zip file containing the data for this project assignment from Moodle. Unzip the file in a directory that will serve as your working directory. When you start Julia, make sure to change your working directory to the directory where you unzipped the data.

The data for this project came from Kaggle, where somebody had used the Zomato API to scrape it. Zomato API analysis is useful for foodies who want to taste the best cuisines from every part of the world within their budget.

For more information on the Zomato API and the Zomato API key, see https://developers.zomato.com/api#headline1 and https://developers.zomato.com/documentation

Data fetching: the data was originally collected from the Zomato API in the form of .json files (raw data). The collected data is stored in the comma-separated values file zomato.csv.

Each restaurant in the dataset is uniquely identified by its Restaurant Id. Every restaurant record contains the following variables: <br>
•	Restaurant Id: Unique id of every restaurant across various cities of the world <br>
•	Restaurant Name: Name of the restaurant <br>
•	Country Code: Country in which restaurant is located <br> 
•	City: City in which restaurant is located <br>
•	Address: Address of the restaurant <br>
•	Locality: Location in the city <br>
•	Locality Verbose: Detailed description of the locality <br>
•	Longitude: Longitude coordinate of the restaurant's location <br>
•	Latitude: Latitude coordinate of the restaurant's location <br>
•	Cuisines: Cuisines offered by the restaurant <br>
•	Average Cost for two: Cost for two people in different currencies 👫 <br>
•	Currency: Currency of the country <br>
•	Has Table booking: yes/no <br>
•	Has Online delivery: yes/no <br>
•	Is delivering: yes/no <br>
•	Switch to order menu: yes/no <br>
•	Price range: range of price of food <br>
•	Aggregate Rating: Average rating out of 5 <br>
•	Rating color: Color assigned according to the average rating <br>
•	Rating text: Text description based on the rating <br>
•	Votes: Number of ratings cast by people

### 4.1 Plot the user ratings for the restaurants

Read the zomato data into Julia via the `CSV.read` function and look at the first few rows.
There are many columns in this dataset. You can see how many by typing `ncol(zomato)` (you can see the number of rows with the `nrow` function). In addition, you can see the names of each column by typing `names(zomato)` (the names are also listed on the previous page.) 
Make a simple histogram of the ratings (column 18 in the zomato dataset).

In [15]:
using CSV
using DataFrames
using Plots
gr()
# CSV.read requires a sink type (here DataFrame) as its second argument
zomato = CSV.read("./RestaurantData/zomato.csv", DataFrame)

histogram(zomato."Aggregate rating")

### 4.2 Finding the 3 best restaurants in a country by price range 

Write a function called `best` that takes two arguments: the 3-character code of a country and a price range (1, 2, 3, or 4). The function should read the zomato.csv file and return a data frame with the names and cities of the 3 restaurants that have the best (i.e. highest) rating for that price range in that country. There are 4 price range levels. Restaurants that do not have data on a particular price range should be excluded from the set of restaurants when deciding the rankings.
Handling ties. If there is a tie for the best restaurant for a given price range, then the restaurant names should be sorted in alphabetical order.
The function should use the following template.

```julia
function best(country, pricerange)
        ## Read zomato data
        ## Check that country and price range are valid
        ## Return restaurant names in that country with highest ranking
end
```

The function should check the validity of its arguments. If an invalid country value is passed to `best`, the function should throw an error via the `error` function with the exact message “invalid country”. If an invalid price range value is passed to `best`, the function should throw an error via the `error` function with the exact message “invalid price range”.
Here is some sample output from the function (WARNING: we will run different tests and the results shown here might not actually be correct)

```julia
julia> best("USA", 1)
3 rows × 2 columns

  Restaurant Name          City
  String                   String
1 Oakwood Cafe             Dalton
2 Rae's Coastal Cafe       Augusta
3 Shorts Burger and Shine  Cedar Rapids/Iowa City

julia> best("IND", 3)
3 rows × 2 columns

  Restaurant Name            City
  String                     String
1 AB's - Absolute Barbecues  Chennai
2 Sagar Gaire Fast Food      Bhopal
3 Sheroes Hangout            Agra

julia> best("TGH", 4)
Error in best("TGH", 4): invalid country

julia> best("BRA", 5)
Error in best("BRA", 5): invalid price range
```
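
The validation and tie-handling rules described above can be sketched without any packages. Everything in this block is an assumption made for illustration: the records are made up (not taken from zomato.csv), and `best_sketch`, `sample`, `records`, and `n` are hypothetical names. It is not the required `best` function, only the ranking logic in isolation.

```julia
# Made-up illustrative records; not taken from zomato.csv
sample = [
    (name = "Delta Diner", city = "Springfield", country = "USA", price = 1, rating = 4.5),
    (name = "Alpha Grill", city = "Shelbyville", country = "USA", price = 1, rating = 4.5),
    (name = "Beta Bistro", city = "Springfield", country = "USA", price = 1, rating = 4.9),
    (name = "Gamma Cafe",  city = "Ogdenville",  country = "USA", price = 1, rating = 4.1),
    (name = "Omega Oven",  city = "Springfield", country = "USA", price = 2, rating = 5.0),
]

function best_sketch(country, pricerange; records = sample, n = 3)
    # Validate the arguments first and throw with the exact messages the task asks for
    any(r -> r.country == country, records) || error("invalid country")
    pricerange in 1:4 || error("invalid price range")

    matches = filter(r -> r.country == country && r.price == pricerange, records)

    # Rating descending; ties broken by name ascending
    ranked = sort(matches, by = r -> (-r.rating, r.name))
    return ranked[1:min(n, end)]   # at most n results, fewer if there are not enough matches
end

[r.name for r in best_sketch("USA", 1)]   # ["Beta Bistro", "Alpha Grill", "Delta Diner"]
```

Sorting once on the tuple `(-rating, name)` expresses both criteria in a single pass, so the result does not depend on whether the sort algorithm is stable.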

In [13]:
function best(country,pricerange,top)
    #================================================
    best(country, pricerange, top)
    country: 3-letter country code (case-insensitive)
    pricerange: Price range from 1 to 4
    top: number of top restaurants to display in the 
    country for the particular pricerange
    ================================================#
    
    # CSV.read requires a sink type (here DataFrame) as its second argument
    zomato = CSV.read("./RestaurantData/zomato.csv", DataFrame)
    
    countryCodes = CSV.read("./RestaurantData/Country-Code.csv", DataFrame)
        
    if pricerange > maximum(zomato."Price range") || pricerange < minimum(zomato."Price range")
        error("invalid price range")
    end
        
    queriedCode = countryCodes[findall(x-> x==uppercase(country),countryCodes."3lettercode"),1]
    
    if isempty(queriedCode)
        error("invalid country")
    end
    
    filteredRestaurants = zomato[findall(y-> y==queriedCode[1],zomato."Country Code"),:];
    filteredRestaurants = filteredRestaurants[findall(pr-> pr==pricerange,
            filteredRestaurants."Price range"),:]
    
    # Highest rating first; ties are broken alphabetically by restaurant name
    sort!(filteredRestaurants, [order(:"Aggregate rating", rev=true), order(:"Restaurant Name")])
    
    # first(df, n) returns at most n rows, so having fewer than `top` matches is not an error
    topRestaurants = first(filteredRestaurants, top)
        
    DF = DataFrame(Restaurant = topRestaurants."Restaurant Name", 
        City = topRestaurants."City",
        Locality = topRestaurants."Locality",
        Rating = topRestaurants."Aggregate rating",
        PriceRange = topRestaurants."Price range")
    
    println(DF,"\n\n")
    
end



best (generic function with 1 method)

In [14]:
country = "USA"
pricerange = 4
top = 5

best(country,pricerange,top)

country = "bra"
pricerange = 3
best(country,pricerange,top)

country = "inD"
pricerange = 1
best(country,pricerange,top)

5×5 DataFrame
│ Row │ Restaurant              │ City           │ Locality          │ Rating  │ PriceRange │
│     │ String                  │ String         │ String            │ Float64 │ Int64      │
├─────┼─────────────────────────┼────────────────┼───────────────────┼─────────┼────────────┤
│ 1   │ Mama's Fish House       │ Rest of Hawaii │ Paia              │ 4.9     │ 4          │
│ 2   │ Bern's Steak House      │ Tampa Bay      │ Hyde Park         │ 4.7     │ 4          │
│ 3   │ Texas de Brazil         │ Orlando        │ I-Drive/Universal │ 4.6     │ 4          │
│ 4   │ Duke's Waikiki          │ Rest of Hawaii │ Waikiki           │ 4.4     │ 4          │
│ 5   │ Maggiano's Little Italy │ Orlando        │ I-Drive/Universal │ 4.4     │ 4          │


5×5 DataFrame
│ Row │ Restaurant        │ City             │ Locality                  │ Rating  │ PriceRange │
│     │ [90mString[39m            │ [90mString[39m           │ [9