# Python Data Types

The two crucial steps in learning any programming language are:

* Understanding the data types that exist in the language.
* Learning how to manipulate data: create, store, read, change, or remove it.

A **data type** is a **class** that describes a kind of **value**. It determines which operations can be performed on that value.
For example, the data type int represents whole numbers; knowing that a value is an int tells us that arithmetic operations such as addition and multiplication are valid for it.
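
As a small illustration of how the data type decides which operations are valid, the sketch below applies the same + operator to an int and to a str. The variable names are made up for this example.

```python
# The type of a value decides what an operator such as + does with it.
whole = 4      # int
text = "4"     # str

int_sum = whole + whole    # integer addition
str_sum = text + text      # string concatenation
print(type(whole).__name__, int_sum)
print(type(text).__name__, str_sum)

# Mixing incompatible types raises an error instead of guessing a result.
try:
    whole + text
except TypeError as err:
    mixed_error = str(err)
    print("TypeError:", mixed_error)
```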

Python has the following data types built-in by default, in these categories:

* Numeric Types: int, float, complex
* Boolean Type: bool
* Sequence Types: str, list, tuple, range
* Set Type: set
* Mapping Type: dict
* NoneType

## int

int stands for integer. Integers are just whole numbers: positive or negative or zero.

In [66]:
num1=4
type(num1)

int

In [7]:
n=0
type(n)

int

## float 

float stands for floating-point number. A float literal in Python contains a decimal point or uses exponent notation (e) to define the value.

In [5]:
num2=5.9
type(num2)

float

In [6]:
F=1.2e3 #1.2 times 10 to the power of 3
F

1200.0

In [115]:
type(F)

float

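## complex

complex stands for complex number. A complex number has a real part and an imaginary part, and the imaginary part is written with the suffix j.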

In [68]:
num3=5+7j #here j is square root of -1
type(num3)

complex

In [59]:
num3.real

5.0

In [60]:
num3.imag

7.0

#### Extras

In [69]:
a=5.66
b=int(a)
b

5

In [70]:
type(b)

int

In [6]:
c=float(b)
c

5.0

In [71]:
k=7
i=complex(b,k)
i

(5+7j)

In [72]:
type(i)

complex
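
The int() conversion shown above drops the fractional part rather than rounding. The sketch below contrasts it with round() and math.floor() on a negative number, where the three results differ; the variable names are illustrative only.

```python
import math

value = -5.66

truncated = int(value)        # drops the fractional part (moves toward zero)
rounded = round(value)        # rounds to the nearest integer
floored = math.floor(value)   # always moves toward negative infinity

print(truncated, rounded, floored)
```

If the nearest whole number is what you need, round() is usually the safer choice than int().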

## bool

bool stands for Boolean.
Python has two predefined Boolean values, True and False, which behave like the integers 1 and 0 in arithmetic.

In [1]:
b>k

False

In [74]:
flag=b<k
flag

True

In [75]:
type(flag)

bool

In [76]:
int(True)

1

In [77]:
int(False)

0

In [116]:
print(True+True)

2
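
Because True and False behave like 1 and 0, summing a list of Booleans counts how many are True. The sketch below shows this; the completed list is hypothetical data invented for the example.

```python
# bool is a subclass of int, so True and False take part in arithmetic.
bool_is_int = issubclass(bool, int)
print(bool_is_int)
print(True == 1, False == 0)

# Hypothetical data: whether each subject completed the study.
completed = [True, False, True, True]
n_completed = sum(completed)   # each True counts as 1
print(n_completed)
```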


## str

str stands for string. Strings are used in Python to record text information, such as names. A string in Python is a *sequence* of characters, which means Python keeps track of the position of every character, so we can access characters by index and take slices.

In [18]:
name="Prajna"
type(name)

str

In [19]:
name[1]

'r'

In [21]:
name1="My name is Prajna"
name1[11:]

'Prajna'

In [22]:
name1[2]

' '
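
A few more operations follow from a string being a sequence. This is a minimal sketch; it also shows that, unlike a list, a string cannot be changed in place.

```python
name = "Prajna"

length = len(name)        # number of characters
last_char = name[-1]      # negative indexes count from the end
reversed_name = name[::-1]  # a slice with step -1 reverses the string
print(length, last_char, reversed_name)

for ch in name[:3]:       # strings can be iterated character by character
    print(ch)

# Strings are immutable: item assignment raises TypeError.
try:
    name[0] = "p"
except TypeError as err:
    print("TypeError:", err)
```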

In [96]:
long_string='''
Treatment A
  (N=xxx)
'''
print(long_string)


Treatment A
  (N=xxx)



In [97]:
type(long_string)

str

#### Extras

#### Examples of some useful methods similar to SAS functions
Don't worry! We will learn a lot more about methods gradually in this course.

In [28]:
str1 = "this is string example....wow!!!"
str2 = "exam"

print(str1.index(str2)) #index in SAS
print(str1.split(" ")) #scan from SAS is similar
print(str1.upper()) #upcase function in SAS 

15
['this', 'is', 'string', 'example....wow!!!']
THIS IS STRING EXAMPLE....WOW!!!


In [37]:
STUDYID='ABC123'
SITEID='01'
SUBJID='001'
USUBJID=STUDYID+SITEID+SUBJID
USUBJID

'ABC12301001'

In [98]:
USUBJID=STUDYID + '-' + SITEID + '-' + SUBJID
USUBJID

'ABC123-01-001'

In [143]:
#Using the multiplication symbol to create repetition!
letter='z'
letter*10

'zzzzzzzzzz'
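
Besides the + operator, the same hyphen-separated identifier can be built with str.join or an f-string. The sketch below reuses the variable names from the cells above; the result names are made up for this example.

```python
STUDYID = 'ABC123'
SITEID = '01'
SUBJID = '001'

# str.join places the separator between the parts
usubjid_joined = '-'.join([STUDYID, SITEID, SUBJID])

# an f-string substitutes the variables in place
usubjid_fstring = f'{STUDYID}-{SITEID}-{SUBJID}'

print(usubjid_joined)
print(usubjid_fstring)
```

Both forms tend to be easier to read than a long chain of + once more than two or three parts are involved.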

## list

Lists are ordered collections of items.
A few features of lists are:
* Written within square brackets []
* Mutable, meaning that we can make changes to a list.
* Allow heterogeneous objects.
* Allow duplicate objects.
* Preserve the order of their items.
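
The features listed above can be seen in a few lines. This is a small sketch with made-up values.

```python
# Heterogeneous items and duplicates are both allowed.
items = [1, 'Drug A', 2.5, True, 1]
original_length = len(items)
print(original_length)

# Lists are mutable: the same list object changes in place.
items.append('Placebo')   # add an item to the end
items.remove(2.5)         # remove the first item equal to 2.5
print(items)              # remaining items keep their original order
```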


In [2]:
li1=[1,2,3,4,5,5]
type(li1)


list

### Indexing, slicing and updating a list

In Python, we use square brackets [] after an object to access its elements by index. Note that indexing starts at 0 in Python.

In [82]:
li1[0:3] #the slice starts at index 0 and stops at index 3-1

[1, 2, 3]

In [117]:
li1[:3] #if the start is omitted, Python starts the slice at index 0

[1, 2, 3]

In [121]:
li1[0: :4] #4 is the step (increment)

[1, 5]

In [122]:
li1[ : :4]

[1, 5]

In [79]:
li2=['Drug A', 'Drug B', 'Placebo']
li2[0] # Grabbing the element at index 0

'Drug A'

In [81]:
li3=[1,2,'a',True]
li3[3]=False #changing the element at index 3
li3

[1, 2, 'a', False]
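
Indexing and slicing also work from the end of a list, and a slice can be the target of an assignment. The sketch below, which starts from the same li1 as above, shows both along with del.

```python
li1 = [1, 2, 3, 4, 5, 5]

last_item = li1[-1]     # negative indexes count from the end
last_two = li1[-2:]     # slice of the last two elements
print(last_item, last_two)

li1[1:3] = [20, 30]     # slice assignment replaces the items at indexes 1 and 2
del li1[-1]             # delete the last element
print(li1)
```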

## tuple

Tuples are like lists, but unlike lists they cannot be modified, i.e. they are **immutable**. You would use tuples to represent things that shouldn't be changed, such as days of the week or dates on a calendar.

A tuple is constructed with parentheses (), with elements separated by commas.

In [84]:
t=(25,23,66,34)
type(t)

tuple

In [85]:
t[1]='z'

TypeError: 'tuple' object does not support item assignment

In [86]:
t[1]

23

In [87]:
23 in t

True
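
Two details of tuples are easy to miss: a one-element tuple needs a trailing comma, and "changing" a tuple really means building a new one. The following sketch illustrates both, plus unpacking; the names other than t are made up for this example.

```python
t = (25, 23, 66, 34)

# Unpacking assigns the elements to separate names; *rest collects the remainder in a list.
first, second, *rest = t
print(first, second, rest)

# The comma, not the parentheses, makes a one-element tuple.
single = (25,)
not_a_tuple = (25)
print(type(single).__name__, type(not_a_tuple).__name__)

# Since a tuple cannot be modified, build a new tuple instead.
t_updated = t[:1] + ('z',) + t[2:]
print(t_updated)
print(t)   # the original tuple is unchanged
```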

## range

A range represents an immutable sequence of integers.
The range() function returns a sequence of numbers that starts from 0 by default, increments by 1 by default, and stops before a specified number.

In [99]:
#range creates objects that we can iterate over
r1=range(10) #first element is automatically taken to be 0
r1

range(0, 10)

In [100]:
type(r1)

range
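
A range accepts up to three arguments: start, stop, and step. Like other sequences, it supports len(), indexing, and membership tests without being converted to a list first. A minimal sketch with illustrative names:

```python
evens = range(10, 21, 2)       # start=10, stop=21 (excluded), step=2

n_evens = len(evens)
first_even, last_even = evens[0], evens[-1]
print(n_evens, first_even, last_even)
print(14 in evens, 15 in evens)

# A negative step counts downward; stop is still excluded.
countdown = list(range(5, 0, -1))
print(countdown)
```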

#### Extras

In [41]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [145]:
list(range(10,20))

[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

In [146]:
#all even numbers from 10 to 20
list(range(10,21,2))

[10, 12, 14, 16, 18, 20]

In [147]:
for num in range(10):
    print(num)

0
1
2
3
4
5
6
7
8
9


In [148]:
for num in range(10):
    print("Visit Day 95")

Visit Day 95
Visit Day 95
Visit Day 95
Visit Day 95
Visit Day 95
Visit Day 95
Visit Day 95
Visit Day 95
Visit Day 95
Visit Day 95


## set

Sets are unordered collections of unique objects. A few features of sets are:
* Written within curly brackets {}
* Unordered
* Duplicates are not allowed
* Allow heterogeneous objects

In [129]:
s={25, 23, 66, 34}
type(s)

set

In [123]:
s #sets are unordered; the ascending display below is not a guaranteed order

{23, 25, 34, 66}

In [134]:
s.add(100) #adding an element to the set
s

{23, 25, 34, 66, 100}

In [135]:
s1={25, 23, 66, 66, 34, "def", "abc"}
s1

{23, 25, 34, 66, 'abc', 'def'}

In [136]:
s1[1]

TypeError: 'set' object is not subscriptable

In [137]:
66 in s1

True

#### Extras

In [3]:
#find out the list of unique subjects
subject_list=[113,113,101,101,111,112]
list(set(subject_list))

[112, 113, 101, 111]
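
Because a set is unordered, the order of list(set(subject_list)) should not be relied on. If a predictable order matters, two common options are sketched below: sorting the unique values, or keeping the order of first appearance with dict.fromkeys. The result names are made up for this example.

```python
subject_list = [113, 113, 101, 101, 111, 112]

# Unique subjects in ascending order
unique_sorted = sorted(set(subject_list))

# Unique subjects in the order they first appear
# (dict keys are unique and keep insertion order in Python 3.7+)
unique_first_seen = list(dict.fromkeys(subject_list))

print(unique_sorted)
print(unique_first_seen)
```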

## dict

A dictionary is a collection of key-value pairs where each key is unique, but the same value may appear against different keys. Since Python 3.7, a dictionary preserves the order in which its keys were inserted.
A dictionary key can be any hashable Python object, and keys are usually numbers or strings. Values, on the other hand, can be any arbitrary Python object.

Example situations:
* USUBJID-RACE
* SUBJID-SEX

In [154]:
sub_race={
    'ABC-001-01': 'Asian',
    'ABC-001-02': "Asian",
    'ABC-001-03': "Black"
}
type(sub_race)

dict

In [155]:
sub_race['ABC-001-01']

'Asian'

In [105]:
d2={
    'gender': [0,1,2],
    'Y': True
}
type(d2)

dict

In [106]:
d2['gender']

[0, 1, 2]

In [107]:
d2['gender'][1]

1
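
Dictionaries are mutable, so pairs can be added or overwritten after creation. The sketch below also shows .get for looking up a key that may be missing, and a loop over the pairs. The subject 'ABC-001-04', the key 'ABC-001-99', and the values assigned to them are hypothetical.

```python
sub_race = {
    'ABC-001-01': 'Asian',
    'ABC-001-02': 'Asian',
    'ABC-001-03': 'Black',
}

sub_race['ABC-001-04'] = 'White'   # new key: adds a pair
sub_race['ABC-001-02'] = 'Other'   # existing key: overwrites the value

# .get returns a default instead of raising KeyError for a missing key
missing = sub_race.get('ABC-001-99', 'Unknown')
print(missing)

# .items() yields the key-value pairs in insertion order
for usubjid, race in sub_race.items():
    print(usubjid, race)
```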

#### Extras

In [51]:
ex1=[
    {
        'gender': [0,1,2],
        'Y': True
    },
    {
        'gender': ['M', 'F', 'O'],
        'N': False
    }
]

In [52]:
type(ex1)

list

In [53]:
type(ex1[0])

dict

In [54]:
print(ex1[0]['gender'])

[0, 1, 2]


In [55]:
print(ex1[1]['gender'][2])

O


## None

None represents the absence of a value. It is the only value of the type NoneType.

In [108]:
data=None
print(data)

None


In [109]:
type(data)

NoneType
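
The usual way to test for None is the is operator, since None is a single object and is not equal to "empty" values such as 0 or ''. The sketch below assumes a hypothetical helper, find_race, that returns None when a subject is not found.

```python
data = None
print(data is None)

# None is not the same as 0, an empty string, or False.
print(None == 0, None == '', None == False)

def find_race(usubjid, lookup):
    """Hypothetical helper: return the race for usubjid, or None if absent."""
    return lookup.get(usubjid)   # dict.get returns None for a missing key

result = find_race('ABC-001-99', {'ABC-001-01': 'Asian'})
if result is None:
    print('no record found')
```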