warestack/students

A repository dedicated to students using cloud services to build their dissertations.


Part 1: Create a Virtual Machine (VM)

  • Follow the instructions to create a VM.

1-network

  • Add a name.

1-network

  • Set your boot disk.

0-create-vm-boot-disk

  • Complete your VM details.

In this example we will use an Ubuntu 20.04 LTS server.

0-create-vm-details

  • Allow HTTP and HTTPS traffic.

0-create-vm-ports
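
If you prefer working from the command line, the gcloud CLI can create an equivalent VM in one step. A minimal sketch, assuming the Cloud SDK is installed and authenticated; the instance name jupyter-vm and the zone are placeholders you should adjust:

gcloud compute instances create jupyter-vm \
    --zone=europe-west2-a \
    --machine-type=e2-medium \
    --image-family=ubuntu-2004-lts \
    --image-project=ubuntu-os-cloud \
    --tags=http-server,https-server

The http-server and https-server tags correspond to the "Allow HTTP/HTTPS traffic" checkboxes above.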

Install Jupyter Notebook on GCP.

Ensure you have Python installed on your machine; Jupyter Notebook requires Python 3.3 or later. First, update the package lists and upgrade the installed packages:

sudo apt-get update -y && sudo apt-get upgrade -y

Step 1: Install pip

pip is a package manager for Python. It's used to install and manage Python libraries. Execute the following command to install pip for Python 3:

sudo apt install python3 python3-pip -y

Step 2: Upgrade pip (Optional)

It’s a good practice to have the latest version of pip. Upgrade pip using the following command:

pip3 install --upgrade pip

Step 3: Install Jupyter Notebook

Now, install Jupyter Notebook using pip:

pip3 install jupyter

Close your SSH window and reconnect, so that the newly installed binaries are picked up on your PATH.

Step 4: Start and Access Jupyter Notebook

Launch Jupyter Notebook by executing the following command:

jupyter notebook --ip='*' --NotebookApp.token='' --NotebookApp.password=''

Or

~/.local/bin/jupyter-notebook --ip='*' --NotebookApp.token='' --NotebookApp.password=''

If you like, you can set up a token or password:

~/.local/bin/jupyter-notebook --ip='*' --NotebookApp.token='supersecret1234' --NotebookApp.password=''
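
Instead of passing a plain-text token on the command line, Jupyter can also store a hashed password for you; it prompts for the password and records the hash in ~/.jupyter/jupyter_notebook_config.json (use the full ~/.local/bin path if jupyter is not on your PATH):

jupyter notebook password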

On GCP, open port 8888 in the firewall (or allow access on all ports).

  • Click on the default network.

1-network

  • Access the firewall.

1-network

  • Add a new rule.

1-network

  • Create the rule and save it.

1-network

  • Return to your VM instances.

1-network

  • Open your web browser and type the following URL:

1-network

http://{your-ip}:8888/
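
The same firewall rule can also be created with a single CLI command. A sketch, assuming the default network; the rule name allow-jupyter is just an example, and 0.0.0.0/0 opens the port to the whole internet, so keep a token or password set:

gcloud compute firewall-rules create allow-jupyter \
    --network=default \
    --allow=tcp:8888 \
    --source-ranges=0.0.0.0/0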

Part 2: Create a bucket

  • Follow the instructions to create a Bucket.
  • First, navigate to the Cloud Storage Buckets section.

1-network

  • Then create a new Bucket.

1-network

  • Then, add a new Bucket name.

7-Bucket-name

  • Click the confirm button.

8-Confirm

  • Load a dataset from Kaggle (IMDB Movies Dataset).

https://www.kaggle.com/datasets/harshitshankhdhar/imdb-dataset-of-top-1000-movies-and-tv-shows?resource=download

9-movies-dataset-kaggle

  • Upload the dataset.

10-bucket-data
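
The bucket steps can also be scripted with the gcloud CLI. A sketch; my-kaggle-bucket is a placeholder (bucket names are globally unique), and imdb_top_1000.csv is assumed to be the file extracted from the Kaggle download:

gcloud storage buckets create gs://my-kaggle-bucket --location=EU
gcloud storage cp imdb_top_1000.csv gs://my-kaggle-bucket/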

Part 3: Create a new dataset on BigQuery

  • Navigate to the BigQuery interface.

12-BQ-create

  • Create a new dataset.

12-BQ-create

  • Let's call it kaggle_data.

13-Create-dataset

  • Create a new table.

14-Create-table

  • Set the source to Google Cloud Storage.

15-Set-data-bucket

  • Then, select your dataset from your Bucket.

16-Select-data

  • Finally, set your table name to movies and the schema to auto-detect.

17-Name-schema
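
The same dataset and table can be created with the bq CLI. A sketch, reusing the placeholder bucket name from Part 2; --autodetect asks BigQuery to infer the schema, just like the console option:

bq mk --dataset kaggle_data
bq load --autodetect --source_format=CSV \
    kaggle_data.movies \
    gs://my-kaggle-bucket/imdb_top_1000.csv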

  • Let's run some simple queries. Click on your dataset and select the QUERY option in a new tab.

18-BQ-query

  • Run a simple query.

    Do not forget to add a star (*) after the SELECT keyword.

    SELECT * FROM `class-380310.kaggle_data.movies` LIMIT 10;
    

19-Query1

  • Explore more queries using SQL statements.
SELECT * FROM `class-380310.kaggle_data.movies`
WHERE Series_Title LIKE "%Lord%";

20-Query2
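
You can also run queries from the shell with bq query. A sketch of a small aggregation, assuming auto-detect named the columns Director and IMDB_Rating as in the original CSV header:

bq query --use_legacy_sql=false \
    'SELECT Director, COUNT(*) AS movies, ROUND(AVG(IMDB_Rating), 2) AS avg_rating
     FROM `class-380310.kaggle_data.movies`
     GROUP BY Director
     ORDER BY movies DESC
     LIMIT 10'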

Part 4: Access the data from Python

  • Create a new IAM policy for BigQuery

21-IAM

  • Access the service accounts.

22-Service-account-1

  • Create a new service account.

23-Service-account-create

  • Complete the service account details.

    Enter a name and a description.

24-Service-account-details

  • Navigate back to IAM and create the new principals.

25-IAM-setup-principal

  • Add the new role.

Type BigQuery and select the BigQuery Admin role.

26-IAM-create-role

  • The new role should be displayed in your list.

27-Role-created

  • Let's create a new key to access the BigQuery dataset from outside. Go to Service accounts and select Manage keys.

![28-Service-accounts-permissions](assets/28-Service%20accounts-permissions.png)

  • Add a new key.

29-Service-accounts-key

  • Download the JSON file into your local folder.

30-Service-accounts-downloadJSON
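
If you prefer to script these IAM steps, the gcloud CLI can do the same thing. A sketch; the account name bq-reader is just an example, and class-380310 is the project ID used throughout this guide:

# Create the service account (the name is an example)
gcloud iam service-accounts create bq-reader --display-name="BigQuery access"

# Grant the BigQuery Admin role at the project level
gcloud projects add-iam-policy-binding class-380310 \
    --member="serviceAccount:bq-reader@class-380310.iam.gserviceaccount.com" \
    --role="roles/bigquery.admin"

# Download a JSON key for the account
gcloud iam service-accounts keys create key.json \
    --iam-account=bq-reader@class-380310.iam.gserviceaccount.com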

  • Create a new Python script (I am using Colab).

https://colab.research.google.com/drive/1br-6wQoanZGz_Zg0jayCOZI9_zthP3S9?usp=sharing

!pip install google-cloud-bigquery

from google.cloud import bigquery
from google.oauth2 import service_account

# Path to the service account key you downloaded earlier
credentialsPath = r'class-380310-6fb7133281e5.json'

credentials = service_account.Credentials.from_service_account_file(credentialsPath)

client = bigquery.Client(credentials=credentials, project=credentials.project_id)

query = client.query("""
  SELECT * FROM `class-380310.kaggle_data.movies` LIMIT 10
""")

results = query.result()
for row in results:
    print(row)

Make sure you upload your key to Colab, or keep it in the same directory as your script if you run it locally!
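
If you run the script locally, an alternative to hard-coding the key path is the standard GOOGLE_APPLICATION_CREDENTIALS environment variable, which the BigQuery client picks up automatically (your_script.py is a placeholder):

export GOOGLE_APPLICATION_CREDENTIALS="$PWD/class-380310-6fb7133281e5.json"
python3 your_script.py

With the variable set, bigquery.Client() works without the explicit credentials argument.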

  • Finally, run your Python script!

31-Run-python

Now delete everything you have created!

  • Navigate to Buckets and delete the Bucket.
  • Navigate to BigQuery and delete your dataset.
  • Navigate to IAM and delete your principal (use the pencil icon to edit the principal and remove the role).
  • Navigate to IAM Service accounts and delete your BigQuery service account.
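
The cleanup can also be done from the CLI; a sketch, reusing the placeholder names from the earlier steps:

# Delete the bucket and everything in it
gcloud storage rm --recursive gs://my-kaggle-bucket

# Delete the BigQuery dataset and all of its tables
bq rm -r -f kaggle_data

# Delete the service account (this also revokes its keys)
gcloud iam service-accounts delete bq-reader@class-380310.iam.gserviceaccount.com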

Part 5: Run a large language model with Ollama

  • Create a new VM with more resources.

32-Create-standard-node

  • Change the boot disk to Ubuntu 24.04 LTS.

32-Create-standard-node-ubuntu

Connect to the VM using SSH.

  1. Update your VM.
sudo apt-get update
  2. Install Python and pip.
sudo apt-get install python3-pip -y
  3. Download Ollama.
curl -fsSL https://ollama.com/install.sh | sh
  4. Download and run the model.
ollama run llama2
  • Ask a question!

How many Oscars has Sir Ian McKellen won?

You can exit using:

/bye
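
Ollama also exposes a local REST API on port 11434, which you can test from a second SSH session with curl; a sketch (the prompt is just an example):

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "How many Oscars has Sir Ian McKellen won?"
}'

The response is streamed back as JSON, one chunk per line.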
  5. Create a Modelfile, for example using pico.

Run the following command.

pico Modelfile
  6. Then enter the following text.
FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
  • Press Ctrl and S together to save.

  • Then, press Ctrl and X to exit.

  7. Next, create and run the model:
ollama create mario -f ./Modelfile
  • Then run the next command.
ollama run mario
  • Now chat with Mario!

Hi there!

You should receive a response from Mario.

Press Ctrl and C to exit the chat at any time.
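
If you want to see or clean up the models on the VM, Ollama ships two handy subcommands:

ollama list       # show the models you have pulled or created
ollama rm mario   # remove the custom model when you are done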

  8. Install the Ollama Python library.
pip install ollama
  9. Create a new Python script using pico.
pico test.py
  10. Paste the following code.
import ollama

stream = ollama.chat(
    model='llama2',
    messages=[{'role': 'user', 'content': 'Who is Bowser? Show me only 2-3 lines'}],
    stream=True,
)

for chunk in stream:
  print(chunk['message']['content'], end='', flush=True)
  • Or you can get a response from Mario; adjust your code to use the mario model:
import ollama

stream = ollama.chat(
    model='mario',
    messages=[{'role': 'user', 'content': 'Who is Bowser? Show me only 2-3 lines'}],
    stream=True,
)

for chunk in stream:
  print(chunk['message']['content'], end='', flush=True)
  11. Run your script!
python3 test.py

Optional material to set up a UI:

Install Docker

sudo apt install docker.io -y

Run the Open WebUI Docker container:

sudo docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Then browse to http://{your-ip}:8080 (with --network=host, Open WebUI listens on port 8080 by default).
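
As with Jupyter, the port must be reachable through the GCP firewall; a sketch of the matching rule, with allow-open-webui as an example name:

gcloud compute firewall-rules create allow-open-webui \
    --network=default \
    --allow=tcp:8080 \
    --source-ranges=0.0.0.0/0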

