## Lesson 1: Getting started with SQL and BigQuery

In [1]:
%load_ext watermark
%watermark -d -u -a "Duzhe Wang" -v 

Duzhe Wang 
last updated: 2020-09-05 

CPython 3.8.1
IPython 7.12.0


First, check whether the ```google-cloud-bigquery``` package is installed.

In [2]:
!pip list|grep google-cloud-bigquery

google-cloud-bigquery    1.27.2
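
The same check can be done from Python, which avoids relying on `grep` being available in the shell. This is a minimal sketch using the standard-library `importlib.metadata` module (Python 3.8+); the helper name `installed_version` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None
print(installed_version("google-cloud-bigquery"))
```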


### The first step: create a ```Client``` object

In [3]:
from google.cloud import bigquery

In [4]:
# Create the client used for every request to BigQuery
client = bigquery.Client()

**Next**: construct a reference to the dataset with the ```dataset()``` method. Then use the ```get_dataset()``` method, along with the reference, to fetch the dataset.

In [6]:
# Construct a reference to the "hacker_news" dataset
dataset_ref = client.dataset("hacker_news", project="bigquery-public-data")

In [7]:
# API request - fetch the dataset
dataset = client.get_dataset(dataset_ref)

In [8]:
# List all tables in the "hacker_news" dataset
tables = list(client.list_tables(dataset))

In [9]:
tables

[<google.cloud.bigquery.table.TableListItem at 0x10c1a8d60>,
 <google.cloud.bigquery.table.TableListItem at 0x1153c4eb0>,
 <google.cloud.bigquery.table.TableListItem at 0x1153c4fd0>,
 <google.cloud.bigquery.table.TableListItem at 0x1153c4160>]

In [10]:
# print names of all tables in the dataset
for table in tables:
    print(table.table_id)

comments
full
full_201510
stories


**Next**: fetch a table

In [11]:
# Construct a reference to the "full" table
table_ref = dataset_ref.table("full")

In [12]:
# API request - fetch the table
table = client.get_table(table_ref)

In [13]:
table

Table(TableReference(DatasetReference('bigquery-public-data', 'hacker_news'), 'full'))
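
The reference-then-fetch steps above can also be collapsed, because `get_table()` accepts a table ID written as a `project.dataset.table` string. The sketch below assumes an existing `bigquery.Client`; the helper names `full_table_id` and `fetch_table` are hypothetical and not part of the library.

```python
def full_table_id(project, dataset_id, table_id):
    """Build the 'project.dataset.table' string form of a table ID."""
    return f"{project}.{dataset_id}.{table_id}"


def fetch_table(client, project, dataset_id, table_id):
    """Fetch a table in one call; `client` is an existing bigquery.Client."""
    return client.get_table(full_table_id(project, dataset_id, table_id))


# With the client created earlier in this lesson, the call would be:
# table = fetch_table(client, "bigquery-public-data", "hacker_news", "full")
print(full_table_id("bigquery-public-data", "hacker_news", "full"))
```

The explicit reference objects remain useful when several tables from the same dataset are needed, since `dataset_ref` can be reused.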

### Table schema: the structure of a table is called its schema

In [14]:
table.schema

[SchemaField('title', 'STRING', 'NULLABLE', 'Story title', (), None),
 SchemaField('url', 'STRING', 'NULLABLE', 'Story url', (), None),
 SchemaField('text', 'STRING', 'NULLABLE', 'Story or comment text', (), None),
 SchemaField('dead', 'BOOLEAN', 'NULLABLE', 'Is dead?', (), None),
 SchemaField('by', 'STRING', 'NULLABLE', "The username of the item's author.", (), None),
 SchemaField('score', 'INTEGER', 'NULLABLE', 'Story score', (), None),
 SchemaField('time', 'INTEGER', 'NULLABLE', 'Unix time', (), None),
 SchemaField('timestamp', 'TIMESTAMP', 'NULLABLE', 'Timestamp for the unix time', (), None),
 SchemaField('type', 'STRING', 'NULLABLE', 'Type of details (comment, comment_ranking, poll, story, job, pollopt)', (), None),
 SchemaField('id', 'INTEGER', 'NULLABLE', "The item's unique id.", (), None),
 SchemaField('parent', 'INTEGER', 'NULLABLE', 'Parent comment ID', (), None),
 SchemaField('descendants', 'INTEGER', 'NULLABLE', 'Number of story or poll descendants', (), None),
 SchemaField
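
The raw `SchemaField` listing is hard to scan, so it can help to print one short line per column. The sketch below relies only on the `name`, `field_type`, `mode`, and `description` attributes of each field. `FakeField` is a stand-in defined here so the example runs without credentials, and `describe_schema` is a hypothetical helper.

```python
from collections import namedtuple

# Stand-in for bigquery.SchemaField; it only mimics the attributes used below.
FakeField = namedtuple("FakeField", ["name", "field_type", "mode", "description"])


def describe_schema(schema):
    """Return one 'name (TYPE, MODE): description' line per schema field."""
    return [
        f"{field.name} ({field.field_type}, {field.mode}): {field.description}"
        for field in schema
    ]


# Two fields copied from the schema output above
sample_schema = [
    FakeField("title", "STRING", "NULLABLE", "Story title"),
    FakeField("score", "INTEGER", "NULLABLE", "Story score"),
]
for line in describe_schema(sample_schema):
    print(line)
```

With the real table, `describe_schema(table.schema)` should give the same kind of listing for every column.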

In [15]:
client.list_rows(table, max_results=5).to_dataframe()



| | title | url | text | dead | by | score | time | timestamp | type | id | parent | descendants | ranking | deleted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | | The justification for going to go given here i... | | The_rationalist | | 1590055883 | 2020-05-21 10:11:23+00:00 | comment | 23256678 | 23254045 | | | |
| 1 | | | He&#x27;s still doing awesome stuff. See his b... | | davepeck | | 1475590646 | 2016-10-04 14:17:26+00:00 | comment | 12635673 | 12633516 | | | |
| 2 | | | It is hard to predict the future, but despite ... | | soneca | | 1587082178 | 2020-04-17 00:09:38+00:00 | comment | 22894933 | 22894608 | | | |
| 3 | | | Never take &quot;realtime&quot; seriously, unl... | | tormeh | | 1440356944 | 2015-08-23 19:09:04+00:00 | comment | 10106423 | 10105793 | | | |
| 4 | | | I changed from LastPass to Bitwarden. Have bee... | | shelune | | 1570456305 | 2019-10-07 13:51:45+00:00 | comment | 21180749 | 21175332 | | | |
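
The schema describes `time` as Unix time and `timestamp` as the timestamp for that Unix time. As a quick sanity check, the conversion can be reproduced with the standard library. The helper name `unix_to_utc` is made up for illustration, and the input value is the `time` of the first preview row.

```python
from datetime import datetime, timezone


def unix_to_utc(seconds):
    """Convert Unix time in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# `time` value from the first preview row
print(unix_to_utc(1590055883))  # 2020-05-21 10:11:23+00:00
```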


In [16]:
client.list_rows(table, selected_fields=table.schema[:1], max_results=5).to_dataframe()



| | title |
|---|---|
| 0 | |
| 1 | |
| 2 | |
| 3 | |
| 4 | |
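
Slicing `table.schema` by position works, but picking columns by name is often easier to read. The `title` column is empty in this preview, most likely because the first rows are comments. The sketch below filters a schema by field name; `pick_fields` is a hypothetical helper, and `FakeField` is a stand-in for `bigquery.SchemaField` so the example runs without credentials.

```python
from collections import namedtuple

# Stand-in for bigquery.SchemaField; only the `name` attribute matters here.
FakeField = namedtuple("FakeField", ["name", "field_type"])


def pick_fields(schema, names):
    """Return the schema fields whose names are in `names`, in schema order."""
    wanted = set(names)
    return [field for field in schema if field.name in wanted]


sample_schema = [
    FakeField("title", "STRING"),
    FakeField("text", "STRING"),
    FakeField("by", "STRING"),
]
print([field.name for field in pick_fields(sample_schema, ["by", "text"])])
```

With the real table, the result can be passed straight to `selected_fields`, for example `client.list_rows(table, selected_fields=pick_fields(table.schema, ["by", "text"]), max_results=5).to_dataframe()`.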
