##################################################################
####### Crash Course Review Exercises
##################################################################
# Complete the tasks numbered below.
# If you get stuck, check out the solution video!
##################################################################
#######
# TASK 1: Import pandas and numpy
######
import numpy as np
import pandas as pd
#######
# TASK 2: Set Numpy's random number generator seed to 101
######
np.random.seed(101)
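# A minimal sketch (not part of the original exercise) of why the seed matters:
# two generators seeded with the same value produce the same draws. Separate
# np.random.RandomState objects are used so the global stream seeded above is
# left untouched. The names demo_rng_a, demo_rng_b, demo_draw_a and demo_draw_b
# are hypothetical.
import numpy as np

demo_rng_a = np.random.RandomState(101)
demo_rng_b = np.random.RandomState(101)
demo_draw_a = demo_rng_a.randint(1, 101, 5)
demo_draw_b = demo_rng_b.randint(1, 101, 5)
print(np.array_equal(demo_draw_a, demo_draw_b))  # True: same seed, same values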
#######
# TASK 3: Create a NumPy matrix of 100 rows by 5 columns consisting of
# random integers from 1-100. (Keep in mind that np.random.randint
# excludes the upper limit, so pass 101 to make 100 possible.)
######
mat = np.random.randint(1,101,(100,5))
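# A quick sanity check, sketched on a separate array so the global random
# stream is not consumed: the shape should be (100, 5) and every value should
# lie in 1-100 because the upper bound 101 is exclusive. demo_mat is a
# hypothetical name, and the seed 0 is an arbitrary choice for illustration.
import numpy as np

demo_mat = np.random.RandomState(0).randint(1, 101, (100, 5))
print(demo_mat.shape)  # (100, 5)
print(demo_mat.min() >= 1 and demo_mat.max() <= 100)  # True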
#######
# TASK 4: Now use pd.DataFrame() to read in this NumPy array as a DataFrame.
# Simply pass the NumPy array into that function to get back a
# DataFrame. Pandas will automatically label the columns 0-4.
######
df = pd.DataFrame(data = mat)
#######
# TASK 5: Using your previously created DataFrame, use [df.columns = [...]]
# (https://stackoverflow.com/questions/11346283/renaming-columns-in-pandas)
# to rename the pandas columns to be ['f1','f2','f3','f4','label'].
######
# Alternative: pass the names when constructing the DataFrame instead:
# df = pd.DataFrame(data=mat, columns=['f1', 'f2', 'f3', 'f4', 'label'])
df.columns = ['f1', 'f2', 'f3', 'f4', 'label']
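# Another option, sketched here on a small throwaway frame: DataFrame.rename
# maps old labels to new ones and returns a new DataFrame, leaving the
# original unchanged. demo_df and demo_renamed are hypothetical names used
# only for this illustration.
import pandas as pd

demo_df = pd.DataFrame([[1, 2], [3, 4]])
demo_renamed = demo_df.rename(columns={0: 'f1', 1: 'label'})
print(list(demo_renamed.columns))  # ['f1', 'label']
print(list(demo_df.columns))       # [0, 1], the original labels are kept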
#######
# TASK 6: Alright, all the other tasks were hopefully straightforward.
# This final task will let you quickly check whether you are at
# the right level with pandas! Do the following:
# Create a DataFrame with the columns ['A','B','C','D'], with each column
# having 50 rows of random numbers for data. The random numbers should be
# between 0 and 100. (Hint: Use NumPy to create the numbers, then pass
# them into pd.DataFrame(). Check out the data= and index= parameters
# for that call.)
######
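# np.random.randint excludes the upper bound, so these values fall in 0-99;
# use 101 as the upper bound if 100 itself should be possible.
rnum = np.random.randint(0, 100, (50, 4))
# Equivalent alternative: draw 200 values, then reshape to 50 rows by 4 columns.
# rnum = np.random.randint(0, 100, 200).reshape(50, 4)
df = pd.DataFrame(data=rnum, columns=['A', 'B', 'C', 'D'])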
print(df)
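
# The Task 6 hint also mentions the index= parameter. A minimal sketch of how
# it could be used, on a small throwaway frame: row labels are passed through
# index= alongside data= and columns=. The names demo_values and demo_indexed,
# the row labels, and the seed 0 are assumptions made for illustration.
import numpy as np
import pandas as pd

demo_values = np.random.RandomState(0).randint(0, 101, (3, 2))
demo_indexed = pd.DataFrame(data=demo_values,
                            index=['r1', 'r2', 'r3'],
                            columns=['A', 'B'])
print(demo_indexed.shape)        # (3, 2)
print(list(demo_indexed.index))  # ['r1', 'r2', 'r3']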