<a target="_blank" href="https://colab.research.google.com/github/ChuBL/Colab_Tutorial_CS579/blob/main/Colab_Tutorial_CS579.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

Special thanks to the [Data Professor YouTube channel](http://youtube.com/dataprofessor) and
[N46Whisper](https://github.com/Ayanaminn/N46Whisper).


<a name="0-outline"></a>
# 0. Outline

1. [Headings](#1-heading)
2. Package Usage
3. [Using R in Colab](#3-R)
4. [Data Handling](#4-data)
5. [YData Profiling](#5-ydata)
6. [Connecting to Google Drive](#6-drive)
7. [Colab Pro](#7-colabpro)



<a name="1-heading"></a>
# 1. Heading 1
[[Back to Top](#0-outline)]

This is the level 1 heading.

## 1.1 Heading 2
This is the level 2 heading.

You can click the `ᐯ` to the left of a heading to fold its section.

### 1.1.1 Heading 3

❗Oops❗**(1/4)**

The heading for the next section is missing!

Please insert a *text cell* with a **level 1 heading** named `2. Package Usage` below this cell so that the sections fold correctly.

Welcome to section 2, package usage.

This section covers basic package management.

Colab comes with many packages preinstalled.

## 2.1 Importing Packages

In [None]:
import numpy as np
import pandas as pd

In [None]:
# numpy
zeros_array = np.zeros((3, 3))
print(zeros_array)

# pandas
data = {'Name': ['John', 'Jane', 'Mike'], 'Age': [30, 25, 40]}
df = pd.DataFrame(data)
print(df)

## 2.2 Installing Packages

Some packages are not preinstalled, so you must install them before importing.

In [None]:
# Try importing without installing first
import py3Dmol

❗Oops❗**(2/4)**

We got a <font color='red'>ModuleNotFoundError</font>!

Please insert a *code cell* below and run

`!pip install py3Dmol`

*The exclamation point tells the notebook cell to run the following command as a shell command.*
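
Before installing anything, it can help to check from Python whether a module is already importable. The following is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name chosen for illustration, not part of Colab or pip.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# `json` ships with Python, so it should always be found.
print(is_installed("json"))

# A made-up module name that should not exist in any environment.
print(is_installed("surely_not_a_real_module_xyz"))
```

A check like this only tells you whether the import would find the module; installing a missing package still requires the shell command described above.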

In [None]:
# Now, try importing it again
import py3Dmol

## 2.3 Play with the Installed 3D Package

[Source](https://pypi.org/project/py3Dmol/)

In [None]:
# A toy example for the installed package
view = py3Dmol.view(query='pdb:1ubq')
view.setStyle({'cartoon':{'color':'spectrum'}})
view

<a name="3-R"></a>
# 3. Using R in Colab (Try this after class)

[[Back to Top](#0-outline)]

In [None]:
# Check the Python version
! python --version
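
The shell command above asks the operating system which `python` it finds. As a small sketch, the same information can also be read from inside the running interpreter with the standard library; the exact version printed depends on the runtime you are connected to.

```python
import platform
import sys

# Full version string of the interpreter running this cell.
print(platform.python_version())

# Structured form that is convenient for comparisons in code.
major, minor = sys.version_info[:2]
print(f"Running Python {major}.{minor}")
```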

Go to the menu `Runtime` -> `Change runtime type` and switch the runtime type from `Python 3` to `R`.

In [None]:
%%script false --no-raise-error
# comment out the first line to activate this code block

# Check it again
! python --version

In [None]:
%%script false --no-raise-error
# comment out the first line to activate this code block

# Check the R version
version

In [None]:
%%script false --no-raise-error
# comment out the first line to activate this code block

# AI generated hello world example
# Create a string
message <- "Hello, World!"

# Split the string into individual characters
characters <- unlist(strsplit(message, ""))

# Create a data frame with these characters
df <- data.frame(Letters = characters, stringsAsFactors = FALSE)

# Use cat to print all letters on the same line
cat(df$Letters, sep = "")

<a name="4-data"></a>
# 4. Data Handling

[[Back to Top](#0-outline)]

Now, switch back to the Python runtime.

If you have trouble running the cells, go to the menu `Runtime` -> `Restart session`.

## 4.1 Preparation (Just run this section)

The following data-retrieval code is based on the [openmindat python package](https://github.com/ChuBL/OpenMindat), which requires a Mindat API key for data access.

This tutorial provides a temporary key for in-class use, which will be revoked soon after the class. If you wish to have your own key, please refer to https://www.mindat.org/a/how_to_get_my_mindat_api_key.

In [None]:
%%capture --no-stderr
!pip install openmindat
!pip install folium

In [None]:
%%capture --no-stderr
# Recent gdown releases accept the file ID directly; the old `--id` flag was deprecated
! gdown 13dNRtB9WrOtWamLYLlSoG-d8S82rxYOk

In [None]:
import yaml
import os

with open('./.apikey.yaml', 'r') as f:
    yaml_api_key = yaml.safe_load(f)['api_key']

os.environ["MINDAT_API_KEY"] = yaml_api_key
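
Storing the key in an environment variable keeps it out of the query code itself. The sketch below shows one way to fail early with a clear message when such a variable is missing. `require_env` and `DEMO_API_KEY` are hypothetical names used only for illustration, and the value is a placeholder, not a real key.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if unset or empty."""
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Hypothetical placeholder value, for illustration only (not a real key).
os.environ["DEMO_API_KEY"] = "demo-key-123"
key = require_env("DEMO_API_KEY")

# Print only a masked form so the key does not end up in the notebook output.
print(key[:4] + "*" * (len(key) - 4))
```

In this notebook, the analogous call would be `require_env("MINDAT_API_KEY")` after the cell above has run.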

## 4.2 Obtaining the Dataset

In [None]:
#@title **Query Localities of a Country**


# @markdown **【Step 1.】:**<font size="5">Select country for the query.</font>

# encoding:utf-8
country_selection = "Austria"  # @param ["Austria","Brazil", "Canada"]

# @markdown **【Step 2.】:** <font size="5">Specify keywords in the place names. Leave blank for no filter.
# @markdown </br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
# @markdown e.g., `mine`, `deposit`, `xx city`, `xx county`, `xx town`, etc.</font>
txt_info = "mine"  # @param {type:"string"}
# @markdown **【Step 3.】:** <font size="5">Specify keywords in the locality descriptions. Leave blank for no filter.
# @markdown </br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
# @markdown e.g., `mine`, `hot spring`, `tourist area`, `copper`, etc.</font>
description_info = ""  # @param {type:"string"}

from openmindat import LocalitiesRetriever

lr = LocalitiesRetriever()
lr.country(country_selection).txt(txt_info).description(description_info)
lr.saveto('/content/mindat_data')

In [None]:
import json
with open('FILE_PATH', 'r') as f:
    data = json.load(f)

❗Oops❗**(3/4)**

We got a <font color='red'>FileNotFoundError</font>!

Please find the file in the 📁 file browser (left column), copy its path, and paste it into the code below.

In [None]:
with open('PASTE_FILE_PATH_HERE', 'r') as f:
    data = json.load(f)
print(data['results'][0])
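
Besides the file browser, saved files can also be located from code. The following is a minimal sketch that lists every `.json` file under a directory with `pathlib`; `find_json_files` is a hypothetical helper name, and a temporary directory is used so the example runs anywhere.

```python
import tempfile
from pathlib import Path


def find_json_files(root):
    """Return all .json files under `root` (searched recursively) as sorted path strings."""
    return sorted(str(p) for p in Path(root).rglob("*.json"))


# Demonstrate on a temporary directory with one JSON file and one other file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "example.json").write_text("{}")
    (Path(tmp) / "notes.txt").write_text("not json")
    found = find_json_files(tmp)
    print(found)
```

In this notebook, calling `find_json_files('/content/mindat_data')` would list whatever the query cell saved there.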

## 4.3 Visualization

In [None]:
#@title **Plot the Queried Localities**


# @markdown **【Step 1.】:** <font size="5">Select country for the query.</font>

# encoding:utf-8
country_selection = "Brazil"  # @param ["Austria","Brazil", "Canada"]

# @markdown **【Step 2.】:** <font size="5">Specify keywords in the place names. Leave blank for no filter.
# @markdown </br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
# @markdown e.g., `mine`, `deposit`, `xx city`, `xx county`, `xx town`, etc.</font>
txt_info = "mine"  # @param {type:"string"}
# @markdown **【Step 3.】:** <font size="5">Specify keywords in the locality descriptions. Leave blank for no filter.
# @markdown </br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
# @markdown e.g., `mine`, `hot spring`, `tourist area`, `copper`, etc.</font>
description_info = ""  # @param {type:"string"}
# @markdown **【Step 4.】:** <font size="5">Choose the visualization approach.</font>
visualization_selection = "heat map" # @param ["", "pop up","heat map"]


from openmindat import LocalitiesRetriever
import json
import folium
from folium.plugins import HeatMap

lr = LocalitiesRetriever()
lr.country(country_selection).txt(txt_info).description(description_info)
lr.saveto('/content/mindat_data')

# Initialize sums and counters
lat_sum = 0
lon_sum = 0
count = 0

with open('/content/mindat_data/localities.json', 'r') as f:
    data = json.load(f)

# Sum up all latitudes and longitudes
for item in data['results']:
    lat = item['latitude']
    lon = item['longitude']
    # Filter out the (0,0) coordinate and other potentially erroneous coordinates
    if lat != 0.0 and lon != 0.0:
        lat_sum += lat
        lon_sum += lon
        count += 1

# Calculate the average latitude and longitude (the centroid)
if count > 0:
    center_lat = lat_sum / count
    center_lon = lon_sum / count
else:
    center_lat, center_lon = 38.9, -77.0  # Default to Washington, D.C. (west longitude is negative) if no valid data points

# Create a map centered around the calculated centroid
# (named `locality_map` to avoid shadowing the built-in `map`)
locality_map = folium.Map(location=[center_lat, center_lon], zoom_start=6)

if visualization_selection == "pop up":
    # Add a marker with a popup for each location in the JSON data
    for item in data['results']:
        lat = item['latitude']
        lon = item['longitude']
        # Skip the (0, 0) placeholder coordinate
        if lat == 0.0 and lon == 0.0:
            continue
        loc_id = item.get('id')
        txt = item.get('txt', 'No txt provided')  # Default if no description is provided
        url = f'https://www.mindat.org/loc-{loc_id}.html'
        popup_info = folium.Popup(
            f"<div style='width:200px; font-size:16px;'><strong>ID:</strong> {loc_id}<br>"
            f"<strong>Description:</strong> {txt}<br>"
            f"<strong>URL:</strong> <a href='{url}' target='_blank'>{url}</a></div>",
            max_width=265)
        folium.Marker(
            location=[lat, lon],
            popup=popup_info,
            icon=folium.Icon(color='blue', icon='info-sign')
        ).add_to(locality_map)
elif visualization_selection == "heat map":
    # Build the heat map input, skipping entries with a zero coordinate
    heat_map_data = [
        (item['latitude'], item['longitude']) for item in data['results']
        if item['latitude'] != 0.0 and item['longitude'] != 0.0
    ]

    # Add a heat map layer to the map
    HeatMap(heat_map_data).add_to(locality_map)
else:
    raise ValueError("Please select a visualization approach!")
locality_map
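
The map is centered on the plain average of the valid coordinates. That step can be sketched as a standalone function, shown below with made-up sample points. `centroid` is a hypothetical helper name, and averaging latitude and longitude directly is only a rough center, which is adequate for placing a map view.

```python
def centroid(points):
    """Average the (lat, lon) pairs, skipping any pair with a zero coordinate.

    Returns None when no valid point remains.
    """
    valid = [(lat, lon) for lat, lon in points if lat != 0.0 and lon != 0.0]
    if not valid:
        return None
    lat_mean = sum(lat for lat, _ in valid) / len(valid)
    lon_mean = sum(lon for _, lon in valid) / len(valid)
    return lat_mean, lon_mean


# Made-up coordinates for illustration; (0.0, 0.0) stands in for a missing location.
sample = [(10.0, 20.0), (30.0, 40.0), (0.0, 0.0)]
print(centroid(sample))  # → (20.0, 30.0)
```

Returning `None` for an empty result leaves the choice of fallback location to the caller.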


<a name="5-ydata"></a>
# 5. YData Profiling

[[Back to Top](#0-outline)]

[Source](https://docs.profiling.ydata.ai/latest/)

In [None]:
%%capture --no-stderr
!pip install ydata-profiling

In [None]:
import numpy as np
import pandas as pd
from ydata_profiling import ProfileReport

In [None]:
df = pd.DataFrame(
    np.random.rand(100, 5),
    columns=['a', 'b', 'c', 'd', 'e']
)

In [None]:
df

In [None]:
profile = ProfileReport(df, title='YData Profiling Report', html={'style':{'full_width':False}})

In [None]:
profile.to_widgets()

In [None]:
profile.to_notebook_iframe()

In [None]:
profile.to_file(output_file="REPORT.html")
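
To get a feel for the kind of per-column numbers such a report summarizes, here is a minimal standard-library sketch that computes a few descriptive statistics for one column of random values. It is an illustration only, not how ydata-profiling is implemented, and `summarize` is a hypothetical helper name.

```python
import random
import statistics


def summarize(values):
    """Return a few descriptive statistics for one numeric column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


random.seed(0)  # fixed seed so the numbers are reproducible between runs
column = [random.random() for _ in range(100)]
summary = summarize(column)
print(summary)
```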

<a name="6-drive"></a>
# 6. Connecting to Google Drive

[[Back to Top](#0-outline)]

❗Oops❗**(4/4)**

I forgot the code to mount Google Drive on Colab!

Please try to find it under the `<>` (code snippets) button in the left column.

Search the code snippets using keyword:

`drive`

Then, insert the code snippet from `Mounting Google Drive in your VM`.
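
After running the mount snippet, you may want to confirm that the drive is actually visible. The sketch below checks for a `MyDrive` folder under the mount point. Both the default path `/content/drive` and the folder name `MyDrive` are assumptions based on Colab's usual layout, and `is_drive_mounted` is a hypothetical helper name.

```python
import os
import tempfile


def is_drive_mounted(mount_point="/content/drive"):
    """Return True if a `MyDrive` folder exists under `mount_point` (assumed Colab layout)."""
    return os.path.isdir(os.path.join(mount_point, "MyDrive"))


# On Colab, call is_drive_mounted() after running the mount snippet.
# Here a temporary directory stands in for the mount point so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as fake_mount:
    before = is_drive_mounted(fake_mount)
    os.mkdir(os.path.join(fake_mount, "MyDrive"))
    after = is_drive_mounted(fake_mount)

print(before, after)
```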

<a name="7-colabpro"></a>
# 7. Colab Pro

[[Back to Top](#0-outline)]

https://colab.research.google.com/signup