<h1> FitBit Analysis</h1>

The data contains the following columns:

1. date - 1st column (index 0)
2. step_count - 2nd column (index 1)
3. mood - 3rd column (index 2)
4. calories_burned - 4th column (index 3)
5. hours_of_sleep - 5th column (index 4)
6. activity_status - 6th column (index 5)

The file covers 96 days, so each of the six columns has 96 entries.

<h2> Functions used </h2>

1. np.argmin(): find the index of the minimum value in an array
2. np.argmax(): find the index of the maximum value in an array
3. np.max(): find the maximum value in an array
4. np.min(): find the minimum value in an array
5. np.where(condition): return the indices where the condition is True (related to masking)
6. np.average(): find the average value of an array
7. np.unique(): find the unique values in an array
8. np.unique(array, return_counts=True): find the unique values in an array along with the count of each value
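
As a quick reference, here is a minimal sketch of the functions listed above applied to one small array. The array `steps` and its values are made up for illustration and are not taken from the Fitbit file.

```python
import numpy as np

# Illustrative values only (not taken from the Fitbit file)
steps = np.array([300, 120, 450, 120, 900])

print(np.argmin(steps))             # index of the first minimum
print(np.argmax(steps))             # index of the maximum
print(np.min(steps), np.max(steps)) # the minimum and maximum values
print(np.where(steps > 200))        # tuple of index arrays where the condition is True
print(np.average(steps))            # arithmetic mean

# Unique values (sorted) together with how often each occurs
values, counts = np.unique(steps, return_counts=True)
print(values, counts)
```

Note that np.argmin() reports only the first position when the minimum occurs more than once.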

In [2]:
import numpy as np

## Agenda

#### * Slicing in 2d
#### * Fancy Indexing (masking)
#### * Fitness data analysis

#### * sort, argsort, argwhere, argmax functions: the "arg" prefix means the function returns an index (not a value)

## Slicing:

In [3]:
a = [1,2,3,4,5,6]
a

[1, 2, 3, 4, 5, 6]

In [4]:
a[4:]

[5, 6]

In [5]:
### list inside a list

In [6]:
list_m = [[1, 2, 3],
          [1, 2, 45, 1, 54], 
          [78, 56, 42, 1, 2, 3, 4, 1]]

In [7]:
list_m[:2]

[[1, 2, 3], [1, 2, 45, 1, 54]]

## 2d : arrays (slice and index)

In [8]:
m1 = np.arange(12)
m1

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [9]:
m1.ndim

1

In [10]:
m1 = m1.reshape(3,4)
m1

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [11]:
m1[:2] 

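# Quiz: does this return (a) the first 2 rows or (b) the first 2 columns?
# Answer: (a). A single slice applies to the first axis, which is the rows.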


array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

## 2d arrays

m1[row, column]

In [12]:
m1[:2, :]

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

In [13]:
m1

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [14]:
m1.ndim

2

In [15]:
m1.shape

(3, 4)

In [16]:
## fetch the first 2 columns: all rows, but only columns 0 and 1

In [17]:
# m1[rows, columns]

In [18]:
m1[:,:2 ]

array([[0, 1],
       [4, 5],
       [8, 9]])

In [19]:
m1[:, [0,3]]

array([[ 0,  3],
       [ 4,  7],
       [ 8, 11]])
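
One practical difference between the two column selections above is worth a short sketch: a slice such as `m1[:, :2]` returns a view that shares memory with `m1`, while an integer list such as `m1[:, [0, 3]]` returns an independent copy. The names `sliced` and `picked` are introduced here only for illustration.

```python
import numpy as np

m1 = np.arange(12).reshape(3, 4)

sliced = m1[:, :2]        # basic slice: a view onto m1
picked = m1[:, [0, 3]]    # integer-list index: an independent copy

print(np.shares_memory(m1, sliced))   # True
print(np.shares_memory(m1, picked))   # False

sliced[0, 0] = 100        # writes through to m1
picked[0, 1] = -1         # does not touch m1
print(m1[0])
```

If the original array must stay untouched, calling `.copy()` on a slice is the usual safeguard.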

#### stepsize

In [20]:
m2 = np.arange(1,31).reshape(3,10)

In [21]:
m2

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]])

In [22]:
m2.shape

(3, 10)

In [23]:
## first approach
m2[:, [1,3,5,7,9] ]

array([[ 2,  4,  6,  8, 10],
       [12, 14, 16, 18, 20],
       [22, 24, 26, 28, 30]])

In [24]:
## second approach : 

m2[:, 1::2] ## columns: start at index 1, run to the last column, step size 2

array([[ 2,  4,  6,  8, 10],
       [12, 14, 16, 18, 20],
       [22, 24, 26, 28, 30]])

In [25]:
m2

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]])

In [26]:
## start and stop given, default step size

m2[:, 1:5] ## columns: start at index 1, stop before index 5 (the stop is exclusive), step size 1

array([[ 2,  3,  4,  5],
       [12, 13, 14, 15],
       [22, 23, 24, 25]])

In [27]:
m2[:, 1:5:1]

array([[ 2,  3,  4,  5],
       [12, 13, 14, 15],
       [22, 23, 24, 25]])

In [28]:
m2

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]])

In [29]:
m2[:, 1::1] ## start is given, end is not given

array([[ 2,  3,  4,  5,  6,  7,  8,  9, 10],
       [12, 13, 14, 15, 16, 17, 18, 19, 20],
       [22, 23, 24, 25, 26, 27, 28, 29, 30]])

In [30]:
m2

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]])

In [31]:
m2[:, 1::2] ## start is given, end is not given

array([[ 2,  4,  6,  8, 10],
       [12, 14, 16, 18, 20],
       [22, 24, 26, 28, 30]])
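
To tie the two approaches together, the sketch below checks that the explicit index list and the `start:stop:step` slice select the same columns, and then shows that the same rule works on rows and with a negative step. The names `by_list`, `by_step` and `reversed_cols` are illustrative.

```python
import numpy as np

m2 = np.arange(1, 31).reshape(3, 10)

by_list = m2[:, [1, 3, 5, 7, 9]]   # explicit column indices
by_step = m2[:, 1::2]              # start at 1, no stop, step 2

print(np.array_equal(by_list, by_step))

# Every second row, and every third column counting back from the last one
reversed_cols = m2[::2, ::-3]
print(reversed_cols)
```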

### Fancy Indexing (masking)

In [32]:
a = np.array([1,2,3,4,5,6])

In [33]:
a

array([1, 2, 3, 4, 5, 6])

In [34]:
a % 3 == 0

array([False, False,  True, False, False,  True])

In [35]:
a[a % 3 == 0]

array([3, 6])

In [36]:
mask_divisible_by_3 = (a % 3 == 0)

In [37]:
a[mask_divisible_by_3]

array([3, 6])

### extend masking to 2d arrays

In [38]:
m1 = np.arange(0,12).reshape(3,4)
m1

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [39]:
m1.shape

(3, 4)

In [40]:
condition = m1 < 6 ## the condition is the mask

In [41]:
condition.shape

(3, 4)

In [42]:
condition

array([[ True,  True,  True,  True],
       [ True,  True, False, False],
       [False, False, False, False]])

In [43]:
m1

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [44]:
## How many True values does the mask have? 6, so the result has size 6

## What are the possible shapes of a 2D array with size = 6?
### -- 2,3
### -- 3,2
### -- 1,6
### -- 6,1

In [45]:
m1[m1<6] ## 2D or 1D?
## 1D array: a boolean mask returns the selected values flattened

array([0, 1, 2, 3, 4, 5])

In [46]:
m1[m1<6].ndim

1
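
Because masking flattens the result, the original 2D layout is lost. When the layout matters, one option is the three-argument form of np.where, sketched below. The fill value -1 is an arbitrary placeholder chosen for this example, and the names `flat` and `kept` are illustrative.

```python
import numpy as np

m1 = np.arange(12).reshape(3, 4)

flat = m1[m1 < 6]                  # 1D: only the selected values
kept = np.where(m1 < 6, m1, -1)    # 2D: same shape as m1, -1 where the mask is False

print(flat.shape, kept.shape)
print(kept)
```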

In [47]:
### extreme case: a mask that is True for every element

In [48]:
m1

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [49]:
m1 > -1

array([[ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

In [50]:
m1[m1 > -1]

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [51]:
m1[m1 > -1].reshape(6,2)

array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11]])

In [52]:
### fetch index where the mask is True

In [53]:
m1

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [54]:
m1 % 2 == 0

array([[ True, False,  True, False],
       [ True, False,  True, False],
       [ True, False,  True, False]])

### indexes where the mask/condition is true 

In [55]:
m1

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [56]:
np.argwhere(m1 % 2 == 0) ### indexes where the mask/condition is true 

array([[0, 0],
       [0, 2],
       [1, 0],
       [1, 2],
       [2, 0],
       [2, 2]], dtype=int64)
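
np.argwhere and np.where(condition) carry the same information in different layouts. The sketch below compares them on the same mask; the names `pairs`, `rows` and `cols` are illustrative.

```python
import numpy as np

m1 = np.arange(12).reshape(3, 4)
mask = m1 % 2 == 0

pairs = np.argwhere(mask)      # shape (6, 2): one [row, col] pair per True
rows, cols = np.where(mask)    # two 1D arrays: row indices and column indices

print(pairs.shape)
print(rows, cols)
print(m1[rows, cols])          # same values as m1[mask]
```

The np.where layout can be passed straight back into the array as an index, which is why it is often the more convenient of the two.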

In [57]:
m2 = np.arange(0,23,2)
m2

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22])

In [58]:
## divisible by 5
m2 % 5 == 0

array([ True, False, False, False, False,  True, False, False, False,
       False,  True, False])

In [59]:
np.argwhere(m2 % 5 == 0)

array([[ 0],
       [ 5],
       [10]], dtype=int64)

In [60]:
m2

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22])

In [61]:
## not divisible by 5
~(m2 % 5 == 0)

array([False,  True,  True,  True,  True, False,  True,  True,  True,
        True, False,  True])

In [62]:
m2[~(m2 % 5 == 0)]

array([ 2,  4,  6,  8, 12, 14, 16, 18, 22])

In [63]:
np.argwhere(~(m2 % 5 == 0))

array([[ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [11]], dtype=int64)
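
Negation with `~` is one of three element-wise logical operators for masks; `&` (and) and `|` (or) combine two masks. The sketch below uses a second, made-up condition (divisible by 4) alongside the divisible-by-5 mask from above.

```python
import numpy as np

m2 = np.arange(0, 23, 2)

div_by_4 = m2 % 4 == 0   # extra condition introduced for this example
div_by_5 = m2 % 5 == 0

both = m2[div_by_4 & div_by_5]        # divisible by 4 and by 5
either = m2[div_by_4 | div_by_5]      # divisible by 4 or by 5
only_4 = m2[div_by_4 & ~div_by_5]     # divisible by 4 but not by 5

print(both, either, only_4)
```

When the conditions are written inline, each one needs its own parentheses, for example `(m2 % 4 == 0) & (m2 % 5 == 0)`, because `&` binds more tightly than `==`.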

In [64]:
any_array = np.arange(1,50,3)

In [65]:
any_array

array([ 1,  4,  7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49])

In [66]:
any_array[2]

7

In [67]:
any_array[2:5]

array([ 7, 10, 13])

In [68]:
np.sum(any_array) ## function  

425

In [69]:
income = np.array([12, 34, 76, 89, 56, 52])
income

array([12, 34, 76, 89, 56, 52])

In [70]:
condition = income>50

In [71]:
condition

array([False, False,  True,  True,  True,  True])

In [72]:
income[condition]

array([76, 89, 56, 52])

In [73]:
expenses = np.array([67, 54, 12, 90, 42, 37])
expenses

array([67, 54, 12, 90, 42, 37])

In [74]:
expenses[condition] ## expense of people having income > 50

array([12, 90, 42, 37])

In [75]:
expenses[income>50]

array([12, 90, 42, 37])
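
The same idea works with a mask built from both arrays at once. As a small sketch, the code below finds the people whose expenses exceed their income; the names `savings` and `overspent` are introduced here for illustration only.

```python
import numpy as np

income = np.array([12, 34, 76, 89, 56, 52])
expenses = np.array([67, 54, 12, 90, 42, 37])

savings = income - expenses      # element-wise difference, person by person
overspent = expenses > income    # mask: True where a person spent more than they earned

print(savings)
print(np.argwhere(overspent).ravel())   # positions of the people who overspent
print(income[overspent])                # their incomes
```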

## Use Case: Fitness data analysis

### Imagine you are a Data Scientist at Fitbit

You've been given one user's data to analyse, with the goal of finding insights that can be shown on the smart watch.

#### But why would we want to analyse user data when designing the watch?

Insights from user data can help the business make customer-oriented decisions about product design.



#### Let's first look at the data we have gathered

Link: https://drive.google.com/file/d/1Uxwd4H-tfM64giRS1VExMpQXKtBBtuP0/view?usp=sharing

<img src='https://drive.google.com/uc?id=1Uxwd4H-tfM64giRS1VExMpQXKtBBtuP0'>


In [82]:
data = np.loadtxt(r'Downloads\fitbit.txt', dtype=str)


In [None]:
# data is 2d matrix

In [83]:
data.shape ## 96 days of data

(96, 6)
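
For readers without the file, the loading step can be tried on an in-memory sample. This is a sketch under one assumption: that the file is whitespace-separated, which is what np.loadtxt expects by default. The three rows are copied from the printed data, and `sample_text` and `sample` are illustrative names.

```python
import io
import numpy as np

# First three rows of the printed data, assumed to be whitespace-separated in the file
sample_text = """06-10-2017 5464 Neutral 181 5 Inactive
07-10-2017 6041 Sad 197 8 Inactive
08-10-2017 25 Sad 0 5 Inactive
"""

# np.loadtxt accepts a file-like object as well as a path
sample = np.loadtxt(io.StringIO(sample_text), dtype=str)

print(sample.shape)   # 3 rows, 6 columns
print(sample[0])      # every field is read as a string
```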

In [None]:
# date, step_count, mood, calories_burned, hours_of_sleep, activity_status

In [None]:
# columns of data:
# date - 1st column (index 0)
# step_count - 2nd column (index 1)
# mood - 3rd column (index 2)
# calories_burned - 4th column (index 3)
# hours_of_sleep - 5th column (index 4)
# activity_status - 6th column (index 5)

# Do all six columns have the same size? Yes: 96 entries each, one per day.

In [None]:
# data

In [84]:
data ## 2d array

array([['06-10-2017', '5464', 'Neutral', '181', '5', 'Inactive'],
       ['07-10-2017', '6041', 'Sad', '197', '8', 'Inactive'],
       ['08-10-2017', '25', 'Sad', '0', '5', 'Inactive'],
       ['09-10-2017', '5461', 'Sad', '174', '4', 'Inactive'],
       ['10-10-2017', '6915', 'Neutral', '223', '5', 'Active'],
       ['11-10-2017', '4545', 'Sad', '149', '6', 'Inactive'],
       ['12-10-2017', '4340', 'Sad', '140', '6', 'Inactive'],
       ['13-10-2017', '1230', 'Sad', '38', '7', 'Inactive'],
       ['14-10-2017', '61', 'Sad', '1', '5', 'Inactive'],
       ['15-10-2017', '1258', 'Sad', '40', '6', 'Inactive'],
       ['16-10-2017', '3148', 'Sad', '101', '8', 'Inactive'],
       ['17-10-2017', '4687', 'Sad', '152', '5', 'Inactive'],
       ['18-10-2017', '4732', 'Happy', '150', '6', 'Active'],
       ['19-10-2017', '3519', 'Sad', '113', '7', 'Inactive'],
       ['20-10-2017', '1580', 'Sad', '49', '5', 'Inactive'],
       ['21-10-2017', '2822', 'Sad', '86', '6', 'Inactive'],
       ['22-10

In [None]:
data

In [85]:
date = data[:, 0] # date

step_count =   data[:, 1]

mood =  data[:, 2]

calories_burned =  data[:, 3]

hours_of_sleep =  data[:, 4]

activity_status =  data[:, 5]
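
The six assignments above can also be written as one unpacking of the transposed array. The sketch below shows this on two rows copied from the printed data; the `_s` names are illustrative and chosen so they do not overwrite the notebook's variables.

```python
import numpy as np

# Two rows copied from the printed data above
sample = np.array([
    ['06-10-2017', '5464', 'Neutral', '181', '5', 'Inactive'],
    ['07-10-2017', '6041', 'Sad', '197', '8', 'Inactive'],
])

# Transposing turns the 6 columns into 6 rows, which unpack into 6 names
date_s, steps_s, mood_s, calories_s, sleep_s, status_s = sample.T

print(steps_s)
print((steps_s == sample[:, 1]).all())   # identical to slicing column 1
```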

In [86]:
date

array(['06-10-2017', '07-10-2017', '08-10-2017', '09-10-2017',
       '10-10-2017', '11-10-2017', '12-10-2017', '13-10-2017',
       '14-10-2017', '15-10-2017', '16-10-2017', '17-10-2017',
       '18-10-2017', '19-10-2017', '20-10-2017', '21-10-2017',
       '22-10-2017', '23-10-2017', '24-10-2017', '25-10-2017',
       '26-10-2017', '27-10-2017', '28-10-2017', '29-10-2017',
       '30-10-2017', '31-10-2017', '01-11-2017', '02-11-2017',
       '03-11-2017', '04-11-2017', '05-11-2017', '06-11-2017',
       '07-11-2017', '08-11-2017', '09-11-2017', '10-11-2017',
       '11-11-2017', '12-11-2017', '13-11-2017', '14-11-2017',
       '15-11-2017', '16-11-2017', '17-11-2017', '18-11-2017',
       '19-11-2017', '20-11-2017', '21-11-2017', '22-11-2017',
       '23-11-2017', '24-11-2017', '25-11-2017', '26-11-2017',
       '27-11-2017', '28-11-2017', '29-11-2017', '30-11-2017',
       '01-12-2017', '02-12-2017', '03-12-2017', '04-12-2017',
       '05-12-2017', '06-12-2017', '07-12-2017', '08-12

In [87]:
step_count

array(['5464', '6041', '25', '5461', '6915', '4545', '4340', '1230', '61',
       '1258', '3148', '4687', '4732', '3519', '1580', '2822', '181',
       '3158', '4383', '3881', '4037', '202', '292', '330', '2209',
       '4550', '4435', '4779', '1831', '2255', '539', '5464', '6041',
       '4068', '4683', '4033', '6314', '614', '3149', '4005', '4880',
       '4136', '705', '570', '269', '4275', '5999', '4421', '6930',
       '5195', '546', '493', '995', '1163', '6676', '3608', '774', '1421',
       '4064', '2725', '5934', '1867', '3721', '2374', '2909', '1648',
       '799', '7102', '3941', '7422', '437', '1231', '1696', '4921',
       '221', '6500', '3575', '4061', '651', '753', '518', '5537', '4108',
       '5376', '3066', '177', '36', '299', '1447', '2599', '702', '133',
       '153', '500', '2127', '2203'], dtype='<U10')

In [88]:
mood

array(['Neutral', 'Sad', 'Sad', 'Sad', 'Neutral', 'Sad', 'Sad', 'Sad',
       'Sad', 'Sad', 'Sad', 'Sad', 'Happy', 'Sad', 'Sad', 'Sad', 'Sad',
       'Neutral', 'Neutral', 'Neutral', 'Neutral', 'Neutral', 'Neutral',
       'Happy', 'Neutral', 'Happy', 'Happy', 'Happy', 'Happy', 'Happy',
       'Happy', 'Happy', 'Neutral', 'Happy', 'Happy', 'Happy', 'Happy',
       'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Neutral',
       'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Happy',
       'Happy', 'Happy', 'Neutral', 'Sad', 'Happy', 'Happy', 'Happy',
       'Happy', 'Happy', 'Happy', 'Happy', 'Sad', 'Neutral', 'Neutral',
       'Sad', 'Sad', 'Neutral', 'Neutral', 'Happy', 'Neutral', 'Neutral',
       'Sad', 'Neutral', 'Sad', 'Neutral', 'Neutral', 'Sad', 'Sad', 'Sad',
       'Sad', 'Happy', 'Neutral', 'Happy', 'Neutral', 'Sad', 'Sad', 'Sad',
       'Neutral', 'Neutral', 'Sad', 'Sad', 'Happy', 'Neutral', 'Neutral',
       'Happy'], dtype='<U10')

In [89]:
calories_burned

array(['181', '197', '0', '174', '223', '149', '140', '38', '1', '40',
       '101', '152', '150', '113', '49', '86', '6', '99', '143', '125',
       '129', '6', '9', '10', '72', '150', '141', '156', '57', '72', '17',
       '181', '197', '131', '154', '137', '193', '19', '101', '139',
       '164', '137', '22', '17', '9', '145', '192', '146', '234', '167',
       '16', '17', '32', '35', '220', '116', '23', '44', '131', '86',
       '194', '60', '121', '76', '93', '53', '25', '227', '125', '243',
       '14', '39', '55', '158', '7', '213', '116', '129', '21', '28',
       '16', '180', '138', '176', '99', '5', '1', '10', '47', '84', '23',
       '4', '0', '0', '0', '0'], dtype='<U10')

In [90]:
hours_of_sleep

array(['5', '8', '5', '4', '5', '6', '6', '7', '5', '6', '8', '5', '6',
       '7', '5', '6', '8', '5', '4', '5', '6', '8', '5', '6', '5', '8',
       '5', '4', '5', '4', '5', '4', '3', '2', '9', '5', '6', '4', '5',
       '8', '4', '5', '6', '5', '6', '5', '6', '5', '6', '5', '6', '7',
       '6', '7', '6', '5', '6', '7', '8', '8', '7', '8', '5', '4', '3',
       '3', '4', '5', '5', '5', '3', '4', '4', '5', '5', '5', '5', '5',
       '5', '4', '3', '4', '5', '5', '4', '5', '3', '3', '3', '2', '3',
       '2', '8', '5', '5', '5'], dtype='<U10')

In [91]:
activity_status

array(['Inactive', 'Inactive', 'Inactive', 'Inactive', 'Active',
       'Inactive', 'Inactive', 'Inactive', 'Inactive', 'Inactive',
       'Inactive', 'Inactive', 'Active', 'Inactive', 'Inactive',
       'Inactive', 'Inactive', 'Inactive', 'Inactive', 'Inactive',
       'Inactive', 'Inactive', 'Inactive', 'Inactive', 'Inactive',
       'Active', 'Inactive', 'Inactive', 'Inactive', 'Inactive', 'Active',
       'Inactive', 'Inactive', 'Inactive', 'Inactive', 'Inactive',
       'Active', 'Active', 'Active', 'Active', 'Active', 'Active',
       'Active', 'Active', 'Active', 'Inactive', 'Inactive', 'Inactive',
       'Inactive', 'Inactive', 'Inactive', 'Active', 'Active', 'Active',
       'Active', 'Active', 'Active', 'Active', 'Active', 'Active',
       'Active', 'Active', 'Active', 'Inactive', 'Active', 'Active',
       'Inactive', 'Active', 'Active', 'Active', 'Active', 'Active',
       'Inactive', 'Active', 'Active', 'Active', 'Active', 'Inactive',
       'Inactive', 'Inactive', 'Inacti

In [92]:
step_count = np.array(step_count, dtype = 'int')
step_count

array([5464, 6041,   25, 5461, 6915, 4545, 4340, 1230,   61, 1258, 3148,
       4687, 4732, 3519, 1580, 2822,  181, 3158, 4383, 3881, 4037,  202,
        292,  330, 2209, 4550, 4435, 4779, 1831, 2255,  539, 5464, 6041,
       4068, 4683, 4033, 6314,  614, 3149, 4005, 4880, 4136,  705,  570,
        269, 4275, 5999, 4421, 6930, 5195,  546,  493,  995, 1163, 6676,
       3608,  774, 1421, 4064, 2725, 5934, 1867, 3721, 2374, 2909, 1648,
        799, 7102, 3941, 7422,  437, 1231, 1696, 4921,  221, 6500, 3575,
       4061,  651,  753,  518, 5537, 4108, 5376, 3066,  177,   36,  299,
       1447, 2599,  702,  133,  153,  500, 2127, 2203])

In [93]:
calories_burned = np.array(calories_burned, dtype = 'int')
calories_burned

array([181, 197,   0, 174, 223, 149, 140,  38,   1,  40, 101, 152, 150,
       113,  49,  86,   6,  99, 143, 125, 129,   6,   9,  10,  72, 150,
       141, 156,  57,  72,  17, 181, 197, 131, 154, 137, 193,  19, 101,
       139, 164, 137,  22,  17,   9, 145, 192, 146, 234, 167,  16,  17,
        32,  35, 220, 116,  23,  44, 131,  86, 194,  60, 121,  76,  93,
        53,  25, 227, 125, 243,  14,  39,  55, 158,   7, 213, 116, 129,
        21,  28,  16, 180, 138, 176,  99,   5,   1,  10,  47,  84,  23,
         4,   0,   0,   0,   0])

In [94]:
hours_of_sleep = np.array(hours_of_sleep, dtype = 'int')
hours_of_sleep

array([5, 8, 5, 4, 5, 6, 6, 7, 5, 6, 8, 5, 6, 7, 5, 6, 8, 5, 4, 5, 6, 8,
       5, 6, 5, 8, 5, 4, 5, 4, 5, 4, 3, 2, 9, 5, 6, 4, 5, 8, 4, 5, 6, 5,
       6, 5, 6, 5, 6, 5, 6, 7, 6, 7, 6, 5, 6, 7, 8, 8, 7, 8, 5, 4, 3, 3,
       4, 5, 5, 5, 3, 4, 4, 5, 5, 5, 5, 5, 5, 4, 3, 4, 5, 5, 4, 5, 3, 3,
       3, 2, 3, 2, 8, 5, 5, 5])
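
The conversion to integers matters because strings are compared character by character, not by numeric value. The sketch below sorts three step counts from the data both ways; it uses `.astype(int)`, which is an alternative spelling of the `np.array(..., dtype='int')` conversion used above.

```python
import numpy as np

raw = np.array(['5464', '25', '614'])   # three step counts from the data, as strings

sorted_as_text = np.sort(raw)                 # lexicographic order
sorted_as_numbers = np.sort(raw.astype(int))  # numeric order

print(sorted_as_text)
print(sorted_as_numbers)
```

As text, '5464' sorts before '614' because '5' comes before '6', which would give wrong answers for questions such as "which day had the most steps".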

##### Is there a relationship between step count, calories burned, and mood?

In [None]:
## 96 days 

In [95]:
step_count

array([5464, 6041,   25, 5461, 6915, 4545, 4340, 1230,   61, 1258, 3148,
       4687, 4732, 3519, 1580, 2822,  181, 3158, 4383, 3881, 4037,  202,
        292,  330, 2209, 4550, 4435, 4779, 1831, 2255,  539, 5464, 6041,
       4068, 4683, 4033, 6314,  614, 3149, 4005, 4880, 4136,  705,  570,
        269, 4275, 5999, 4421, 6930, 5195,  546,  493,  995, 1163, 6676,
       3608,  774, 1421, 4064, 2725, 5934, 1867, 3721, 2374, 2909, 1648,
        799, 7102, 3941, 7422,  437, 1231, 1696, 4921,  221, 6500, 3575,
       4061,  651,  753,  518, 5537, 4108, 5376, 3066,  177,   36,  299,
       1447, 2599,  702,  133,  153,  500, 2127, 2203])

In [96]:
np.mean(step_count)

2935.9375

#### Hypothesis: a higher step count means more calories burned

#### 1. On which day was the step count highest, and how many calories were burned that day?


In [97]:
np.max(step_count)

7422

In [98]:
step_count

array([5464, 6041,   25, 5461, 6915, 4545, 4340, 1230,   61, 1258, 3148,
       4687, 4732, 3519, 1580, 2822,  181, 3158, 4383, 3881, 4037,  202,
        292,  330, 2209, 4550, 4435, 4779, 1831, 2255,  539, 5464, 6041,
       4068, 4683, 4033, 6314,  614, 3149, 4005, 4880, 4136,  705,  570,
        269, 4275, 5999, 4421, 6930, 5195,  546,  493,  995, 1163, 6676,
       3608,  774, 1421, 4064, 2725, 5934, 1867, 3721, 2374, 2909, 1648,
        799, 7102, 3941, 7422,  437, 1231, 1696, 4921,  221, 6500, 3575,
       4061,  651,  753,  518, 5537, 4108, 5376, 3066,  177,   36,  299,
       1447, 2599,  702,  133,  153,  500, 2127, 2203])

In [99]:
np.argmax(step_count) ## index of the maximum value

69

In [100]:
date[69]

'14-12-2017'

In [101]:
date[np.argmax(step_count)]

'14-12-2017'

In [102]:
calories_burned[np.argmax(step_count)] ## calories burned on the day with the highest step count

243

In [103]:
np.min(step_count)

25

In [104]:
np.argmin(step_count)

2

In [105]:
step_count

array([5464, 6041,   25, 5461, 6915, 4545, 4340, 1230,   61, 1258, 3148,
       4687, 4732, 3519, 1580, 2822,  181, 3158, 4383, 3881, 4037,  202,
        292,  330, 2209, 4550, 4435, 4779, 1831, 2255,  539, 5464, 6041,
       4068, 4683, 4033, 6314,  614, 3149, 4005, 4880, 4136,  705,  570,
        269, 4275, 5999, 4421, 6930, 5195,  546,  493,  995, 1163, 6676,
       3608,  774, 1421, 4064, 2725, 5934, 1867, 3721, 2374, 2909, 1648,
        799, 7102, 3941, 7422,  437, 1231, 1696, 4921,  221, 6500, 3575,
       4061,  651,  753,  518, 5537, 4108, 5376, 3066,  177,   36,  299,
       1447, 2599,  702,  133,  153,  500, 2127, 2203])

In [106]:
calories_burned[np.argmin(step_count)] 

0
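
Looking only at the maximum and minimum days tests the hypothesis at two points. One way to check it across many days is the correlation coefficient, which is not used in the original notebook and is offered here only as a sketch. The values are the first ten days of the two arrays above; `steps_10`, `calories_10` and `r` are illustrative names.

```python
import numpy as np

# First ten days of step_count and calories_burned from the arrays above
steps_10 = np.array([5464, 6041, 25, 5461, 6915, 4545, 4340, 1230, 61, 1258])
calories_10 = np.array([181, 197, 0, 174, 223, 149, 140, 38, 1, 40])

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is the correlation
r = np.corrcoef(steps_10, calories_10)[0, 1]
print(round(r, 3))
```

A value close to 1 would support the hypothesis for these ten days; running the same call on the full arrays extends the check to all 96 days.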

### 2. step counts vs mood

In [107]:
mood 

array(['Neutral', 'Sad', 'Sad', 'Sad', 'Neutral', 'Sad', 'Sad', 'Sad',
       'Sad', 'Sad', 'Sad', 'Sad', 'Happy', 'Sad', 'Sad', 'Sad', 'Sad',
       'Neutral', 'Neutral', 'Neutral', 'Neutral', 'Neutral', 'Neutral',
       'Happy', 'Neutral', 'Happy', 'Happy', 'Happy', 'Happy', 'Happy',
       'Happy', 'Happy', 'Neutral', 'Happy', 'Happy', 'Happy', 'Happy',
       'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Neutral',
       'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Happy',
       'Happy', 'Happy', 'Neutral', 'Sad', 'Happy', 'Happy', 'Happy',
       'Happy', 'Happy', 'Happy', 'Happy', 'Sad', 'Neutral', 'Neutral',
       'Sad', 'Sad', 'Neutral', 'Neutral', 'Happy', 'Neutral', 'Neutral',
       'Sad', 'Neutral', 'Sad', 'Neutral', 'Neutral', 'Sad', 'Sad', 'Sad',
       'Sad', 'Happy', 'Neutral', 'Happy', 'Neutral', 'Sad', 'Sad', 'Sad',
       'Neutral', 'Neutral', 'Sad', 'Sad', 'Happy', 'Neutral', 'Neutral',
       'Happy'], dtype='<U10')

In [108]:
np.unique(mood)

array(['Happy', 'Neutral', 'Sad'], dtype='<U10')

In [109]:
np.average(step_count)

2935.9375

In [None]:
## if happy => more step count than avg
## if sad => less step count than avg

In [110]:
np.mean(step_count[mood=="Sad"])

2103.0689655172414

In [111]:
np.mean(step_count[mood=="Happy"])

3392.725
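
Rather than writing one line per mood, the same mask can be applied in a loop over np.unique(mood). The sketch below does this for the first five days of the data, which happen to contain only 'Neutral' and 'Sad'; `step_5`, `mood_5` and `mean_by_mood` are illustrative names.

```python
import numpy as np

# First five days of step_count and mood from the arrays above
step_5 = np.array([5464, 6041, 25, 5461, 6915])
mood_5 = np.array(['Neutral', 'Sad', 'Sad', 'Sad', 'Neutral'])

# One mask per distinct mood; each mask selects that mood's step counts
mean_by_mood = {
    str(m): float(step_5[mood_5 == m].mean())
    for m in np.unique(mood_5)
}
print(mean_by_mood)
```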

### 3. What is my usual mood when step count > 4000, and how often does each mood occur?

In [112]:
mood[step_count>4000]

array(['Neutral', 'Sad', 'Sad', 'Neutral', 'Sad', 'Sad', 'Sad', 'Happy',
       'Neutral', 'Neutral', 'Happy', 'Happy', 'Happy', 'Happy',
       'Neutral', 'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Happy',
       'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Sad',
       'Happy', 'Happy', 'Neutral', 'Happy', 'Neutral', 'Neutral', 'Sad',
       'Happy', 'Neutral', 'Happy'], dtype='<U10')

In [113]:
mood[step_count>4000].size

38

In [114]:
np.unique(mood[step_count>4000], return_counts= True)

(array(['Happy', 'Neutral', 'Sad'], dtype='<U10'),
 array([22,  9,  7], dtype=int64))

### 4. What is my usual mood when step count < 2000, and how often does each mood occur?

In [115]:
mood[step_count<2000]

array(['Sad', 'Sad', 'Sad', 'Sad', 'Sad', 'Sad', 'Neutral', 'Neutral',
       'Happy', 'Happy', 'Happy', 'Happy', 'Happy', 'Neutral', 'Happy',
       'Happy', 'Happy', 'Happy', 'Neutral', 'Happy', 'Happy', 'Happy',
       'Sad', 'Sad', 'Neutral', 'Neutral', 'Sad', 'Sad', 'Sad', 'Sad',
       'Sad', 'Sad', 'Sad', 'Sad', 'Neutral', 'Sad', 'Sad', 'Happy',
       'Neutral'], dtype='<U10')

In [116]:
mood[step_count<2000].size

39

In [117]:
np.unique(mood[step_count<2000] , return_counts= True)

(array(['Happy', 'Neutral', 'Sad'], dtype='<U10'),
 array([13,  8, 18], dtype=int64))
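
To read the two results side by side, the counts can be turned into a most-frequent label and into shares. The sketch below copies the counts from the two np.unique outputs above; the variable names are illustrative.

```python
import numpy as np

# Labels and counts copied from the two np.unique outputs above
labels = np.array(['Happy', 'Neutral', 'Sad'])
counts_over_4000 = np.array([22, 9, 7])
counts_under_2000 = np.array([13, 8, 18])

# argmax on the counts gives the position of the most frequent mood
top_over = labels[np.argmax(counts_over_4000)]
top_under = labels[np.argmax(counts_under_2000)]
print(top_over, top_under)

# Shares make the two groups comparable even though their sizes differ (38 vs 39 days)
share_over = np.round(counts_over_4000 / counts_over_4000.sum(), 2)
share_under = np.round(counts_under_2000 / counts_under_2000.sum(), 2)
print(share_over, share_under)
```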

In [None]:
mood_step_count_less_than_2000 = mood[step_count<2000]

In [None]:
mood_step_count_less_than_2000

In [None]:
(mood_step_count_less_than_2000 == "Happy").sum()

In [None]:
(mood_step_count_less_than_2000 == "Neutral").sum()

In [None]:
(mood_step_count_less_than_2000 == "Sad").sum()

In [None]:
np.unique(mood[step_count<2000])

In [None]:
# Homework:

# Formulate and answer your own questions about hours_of_sleep and activity_status

In [None]:
# np.unique(array, return_counts=True) is the NumPy counterpart of pandas value_counts()

In [None]:
a = [1,3]
b = [3,4]

# a dot b

a[0]*b[0] +a[1]*b[1]


In [None]:
np.dot(a,b)

In [None]:
data[:, 0]  ### indexing with a single integer returns one column as a 1D array

In [None]:
# Slicing with a range instead of a single index keeps the column axis:

data[:, 0:1]  # returns a 2D array of shape (96, 1)
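
The difference between the two forms is easiest to see in the shapes. The sketch below uses three rows and two columns copied from the printed data; `sample`, `col_1d` and `col_2d` are illustrative names.

```python
import numpy as np

# Three rows, first two columns, copied from the printed data
sample = np.array([
    ['06-10-2017', '5464'],
    ['07-10-2017', '6041'],
    ['08-10-2017', '25'],
])

col_1d = sample[:, 0]     # integer index drops the column axis
col_2d = sample[:, 0:1]   # slice keeps it

print(col_1d.shape, col_2d.shape)
print((col_2d.ravel() == col_1d).all())   # same values, different layout
```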

In [None]:
# how do we take the dot product of these two nested lists?
a=[[1,2],[3,4]]
b=[[1],[2]]

In [None]:
a

In [None]:
b
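
Assuming the question is how to multiply the nested lists `a` and `b` as matrices, one way is to convert them to arrays and use `@` (or np.dot). The 1D case from the earlier cell is included for comparison; `vec_result` and `product` are illustrative names.

```python
import numpy as np

# 1D case from the earlier cell: 1*3 + 3*4
vec_result = np.dot([1, 3], [3, 4])
print(vec_result)

# 2D case: a (2, 2) matrix times a (2, 1) column
a = np.array([[1, 2], [3, 4]])
b = np.array([[1], [2]])

product = a @ b          # same result as np.dot(a, b) for 2D arrays
print(product)
print(product.shape)
```

The inner dimensions must match: `a` has 2 columns and `b` has 2 rows, so the result has shape (2, 1).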