environment setup:

pip install:

    notebook
    otter-grader
    datascience
    scipy
    pandas
    matplotlib
    ipywidgets

    optional:

        jupyterlab

install:

    OpenSSL 1.1.1 or higher

In [2]:
from datascience import *
from math import *
import numpy as np

import d8error

## Tables: 

>Table Operations:

<1> table.show(n)

    displays the first n rows of the table
    !! does not create or mutate any table

>>Create new tables out of existing tables:

<2> table.select(columns)

    returns a new table containing only the selected columns
    columns in the new table appear in the order they are listed
    columns can be given as either column labels or column indices

<3> table.drop(columns)

    the inverse of .select()
    returns a new table without the selected columns

<4> table.where(column, condition)

    returns a new table with rows meeting the condition
    example: cones.where('Flavor', 'chocolate')

    condition can be a function!
    example: imdb.where('Year', lambda x: x > 2000)

some of the CONDITIONS (predicates):

|Predicate|Example|Result|
|-|-|-|
|`are.equal_to`|`are.equal_to(50)`|Find rows with values equal to 50|
|`are.not_equal_to`|`are.not_equal_to(50)`|Find rows with values not equal to 50|
|`are.above`|`are.above(50)`|Find rows with values above (and not equal to) 50|
|`are.above_or_equal_to`|`are.above_or_equal_to(50)`|Find rows with values above 50 or equal to 50|
|`are.below`|`are.below(50)`|Find rows with values below 50|
|`are.between`|`are.between(2, 10)`|Find rows with values above or equal to 2 and below 10|
|`are.between_or_equal_to`|`are.between_or_equal_to(2, 10)`|Find rows with values above or equal to 2 and below or equal to 10|

<5> table.sort(column, (descending = True/False))

    returns a new table with rows sorted by the values in column
    descending is optional; the default is False (ascending order)

<6> table.take(row_indices)

    returns a new table containing selected rows
    example: table.take(0, 3, 1)
    
    also: 
    table.take(np.arange(...))

<7> table.column(column)

    returns the values of the selected column as an array
    the argument can be either a column label or a column index

<8> table.relabeled(column, new_label)

    returns a new table with column label changed

>Table Properties:

<1> table.num_rows
    
    And:
    
    table.num_columns

<2> table.labels

    returns a tuple containing the labels of all columns

>Create A Table From Scratch:

<1> Table.read_table('path')

    reads a CSV file at the given path into a new table

<2> with_column and with_columns methods

In [15]:
t = Table()

In [16]:
streets = make_array('Bancroft', 'Durant', 'Channing', 'Haste')
southside = t.with_column('Street name', streets)
southside

Street name
Bancroft
Durant
Channing
Haste


In [None]:
# with_column doesn't mutate the table
t

In [None]:
southside.with_column('City', 'Berkeley')

In [None]:
t.with_columns(
    'Street name', streets,
    'Blocks from campus', np.arange(4),
    'Time to get there', np.arange(1, 8, 2)
    )

## Numpy Arrays:

an array is a sequence of values that all have the same type

In [4]:
# make an array:
first_four = make_array(1, 2, 3, 4)
first_four

array([1, 2, 3, 4], dtype=int64)

In [None]:
type(first_four)

make an array using RANGES:

np.arange((start), end, (step))

start defaults to 0, step defaults to 1, and end itself is excluded

In [None]:
np.arange(6)

In [None]:
np.arange(1, 11, 2)

In [None]:
np.arange(0, 1, 0.1)
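
For reference, a sketch of what integer ranges produce. The variable names here are just for illustration:

```python
import numpy as np

count = np.arange(6)         # one argument: 0 up to, but not including, 6
odds = np.arange(1, 11, 2)   # start 1, stop before 11, step 2
evens = np.arange(0, 10, 2)  # the end value 10 is not included

print(count)  # [0 1 2 3 4 5]
print(odds)   # [1 3 5 7 9]
print(evens)  # [0 2 4 6 8]
```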

## Array Operations: 

In [5]:
np.average(first_four)

2.5

In [6]:
np.sum(first_four)

10

In [8]:
# built-in functions often work on arrays too
sum(first_four)

10

In [9]:
first_four.item(0)

1

In [11]:
# list-style indexing works on arrays too
first_four[0]

1

In [23]:
# note that the data types are different
(type(first_four.item(0)), type(first_four[0]))

(int, numpy.int64)

In [13]:
[n + 1 for n in first_four]

[2, 3, 4, 5]

In [None]:
len(first_four)

>Arithmetic Operations: 

In [None]:
next_four = make_array(5, 6, 7, 8)
next_four

In [None]:
first_four + next_four

In [None]:
first_four * next_four

In [None]:
first_four * 4

In [None]:
first_four + 4 == next_four
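
The cells above all work element by element. Here is a sketch of the results, using `np.array` directly instead of `make_array` so that it only depends on NumPy:

```python
import numpy as np

first_four = np.array([1, 2, 3, 4])
next_four = np.array([5, 6, 7, 8])

total = first_four + next_four         # pairs up elements by position
product = first_four * next_four       # elementwise product, not a dot product
scaled = first_four * 4                # the single number is applied to every element
matches = first_four + 4 == next_four  # comparison also gives one result per element

print(total)    # [ 6  8 10 12]
print(product)  # [ 5 12 21 32]
print(scaled)   # [ 4  8 12 16]
print(matches)  # [ True  True  True  True]
```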