### DAY 3 | Feb 1st, 2025
## Project 2 : Playing Cards
#### Design a deck of playing cards that you can shuffle and deal from

## Part 1: R Objects
- We will be using R to assemble a deck of 52 playing cards.
- We can load most data sets into R with one simple step. But here we will learn how R stores data, and how to assemble or disassemble your own data sets.
## 1.1 Atomic vectors
- An atomic vector is a simple vector of data.
- It can be made by grouping values with c. For example:

In [None]:
die <- c(1,2,3,4,5,6)     #we used c here to group/combine values
die

In [None]:
is.vector(die)           #to check if its a vector

In [None]:
flowers <- 8             #vector of one value 
flowers
is.vector(flowers)
length(die)              #checking length 
length(flowers)

- Each atomic vector stores its values as a one-dimensional vector, and each atomic vector can only store one type of data.
- You can save different types of data in R by using different types of atomic vectors. R has six basic types: doubles, integers, characters, logicals, complex, and raw.

### 1.1.1 Doubles
A double vector stores regular numbers. The numbers can be positive or negative, large or small, and with or without digits to the right of the decimal place.

In [None]:
hi <- c(3,4,5) 
typeof(hi)      #used to check type of vector

### 1.1.2 Integers
It stores integers, numbers that can be written without a decimal component. 

In [None]:
litre <- c(-2L, -4L, 4L, 6L)      #integers can be created by typing a number followed by an uppercase L. 
litre                             #numbers typed without the L are saved as doubles
typeof(litre)

#### Doubles vs Integers 
- Computers handle integers more precisely than doubles. Decimal numbers sometimes can't be stored exactly and need to be rounded.
- For example, π or the square root of 2 have endless decimal places, so computers approximate them.
- This rounding can cause tiny errors called floating-point errors. These errors are small, but they can lead to unexpected results.
- Using integers avoids these errors because they don't need rounding.
- However, most real-world math needs decimals, so data scientists still use doubles, knowing that small errors might happen but are usually not a big problem.
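
The rounding described above can be seen directly. This is a minimal sketch; the variable names `root` and `err` are made up for illustration, and `all.equal` is the base R function for comparing numbers within a small tolerance.

```r
# sqrt(2) cannot be stored exactly, so squaring it does not give exactly 2
root <- sqrt(2)
err <- root^2 - 2
err                             # a tiny nonzero number, not 0

root^2 == 2                     # FALSE: exact comparison trips on the rounding error
isTRUE(all.equal(root^2, 2))    # TRUE: all.equal compares within a small tolerance

# Integer arithmetic on small whole numbers needs no rounding
2L * 3L == 6L                   # TRUE
```

So when comparing doubles, a tolerance-based check is usually safer than `==`.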

### 1.1.3 Characters
A character vector stores small pieces of text. You can create a character vector by typing a character or string of characters surrounded by quotes.

In [None]:
good <- c("bye",  "#123@")      #string can contain numbers or symbols also
good

typeof(good)
typeof("bye")
typeof("#123@")

### 1.1.4 Logicals
Logical vectors store TRUE and FALSE, R’s form of Boolean data. They are very helpful for doing things like comparisons:

In [None]:
2 < 6
check <- c (TRUE, FALSE, FALSE)    #type TRUE or FALSE in capital letters (without quotation marks), R treats input as logical data.
check
typeof(check)
typeof(FALSE)

### 1.1.5 Complex 
Complex vectors store complex numbers. They are rarely used.

In [None]:
com <- c(1 + 9i, 2 + 2i, 6 + 3i)       #add an imaginary term to a number with i
com
typeof(com)

### 1.1.6 Raw 
Raw vectors store raw bytes of data. They are rarely used.

In [None]:
empty <- raw(2)      #Making raw vectors gets complicated, but you can make an empty raw vector of length n with raw(n)
empty                #typing raw on its own would print the raw function, not a raw vector
typeof(empty)

## 1.2 Attributes
- A piece of information attached to an atomic vector.
- Doesn't affect any of the values in the object.
- It is a convenient place to put information associated with an object. 

In [None]:
attributes(die)    #to see which attributes an object has. 
                   #it returns NULL if an object has no attributes. An atomic vector, like die, won’t have any attributes unless we give it some.

The most common attributes to give an atomic vector are names, dimensions (dim), and classes. 
### 1.2.1 Names:

In [None]:
#Assigning name attribute to die, by assigning a character vector to the output of names. 
names(die) <- c("one", "two", "three", "four", "five", "six")           #The vector should include one name for each element in die.

In [None]:
die         #displaying die

In [None]:
names(die)         #checking names of die
attributes(die)    #checking which attribute has been assigned to die

- names won’t affect the actual values of the vector.
- names aren't affected when we manipulate the values of the vector

In [None]:
die+10

In [None]:
names(die) <- c("ek", "do", "tin", "char", "panch", "chey" )   #changing the names by assigning a new set of labels to names.
die

In [None]:
names(die) <- NULL                      #To remove the names attribute, set it to NULL.
die

### 1.2.2 Dimensions 
- An atomic vector can be transformed into an n-dimensional array by giving it a dimensions attribute with dim.
- Set the dim attribute to a numeric vector of length n. R will reorganize the elements of the vector into n dimensions. 
- R will always use the first value in dim for the number of rows and the second value for the number of columns.

In [None]:
dim(die) <- c(2, 3)         #reorganizing die into a 2 × 3 matrix (2 rows, 3 columns)
die

In [None]:
dim(die) <- c(3, 2)        #reorganizing into a 3 × 2 matrix (3 rows, 2 columns)
die

In [None]:
dim(die) <- c(1, 6)        #converting die into a 1 × 6 matrix (a single row)
die

## 1.3 Matrices
- Matrices store values in a two-dimensional array.
- To use the matrix function, first give it an atomic vector to reorganize into a matrix. Then define how many rows the matrix should have by setting the nrow argument. matrix organizes the values into a matrix with the specified number of rows.

In [None]:
m1 <- matrix (die, nrow=2)
m1

In [None]:
m2 <- matrix (die, ncol=2)      #you can set the ncol argument, which tells how many columns to include in the matrix
m2

matrix fills the matrix column by column by default, but you can fill it row by row if you include the argument byrow = TRUE.

In [None]:
m3 <- matrix (die, nrow=2, byrow=TRUE )
m3

## 1.4 Arrays 
- The array function creates an n-dimensional array.
- It can be used to sort values into a cube of three dimensions or a hypercube in 4, 5, or n dimensions.
- It is not as customizable as matrix; it does the same thing as setting the dim attribute.
- To use array, provide an atomic vector as the first argument and a vector of dimensions as the second argument, now called dim.

In [None]:
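arr <- array(c(11:14, 21:24, 31:34), dim = c(2, 2, 3))   #12 values fill 2 rows x 2 columns x 3 slices; dim = c(2, 3, 3) needs 18 values, so R would silently recycle them
arr                       #Jupyter Notebook does not display the array in its correct dimensions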
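
Since the notebook display can be hard to read, one way to inspect a 3-dimensional array is to check its dimensions and pull out one slice at a time. This is a minimal sketch: the name `cube` is made up for illustration, and the square-bracket indexing used here goes slightly beyond what these notes have covered so far.

```r
# Hypothetical example array: 12 values arranged as 2 rows x 2 columns x 3 slices
cube <- array(c(11:14, 21:24, 31:34), dim = c(2, 2, 3))

dim(cube)        # 2 2 3
cube[, , 1]      # first slice: a 2 x 2 matrix holding 11 to 14
cube[1, 2, 3]    # row 1, column 2 of the third slice: 33
```

Each slice is filled column by column, just like a matrix.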

### Making a Matrix

In [None]:
hand1 <- c("ace", "king", "queen", "jack", "ten", "spades", "spades", 
  "spades", "spades", "spades")                                        #make a character vector with the 10 values
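matrix(hand1, nrow=5)        #set the number of rows
matrix(hand1, ncol=2)        #or set the number of columns
dim(hand1) <- c(5,2)         #or set the dim attribute directly
hand1                        #display the result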

In [None]:
#second method: use this if the character vector lists the cards in a slightly different order (face, suit, face, suit, ...).
#In this case, fill the matrix row by row instead of column by column, by using byrow = TRUE

hand2 <- c("ace", "spades", "king", "spades", "queen", "spades", "jack", 
  "spades", "ten", "spades")             

matrix(hand2, nrow = 5, byrow = TRUE)
matrix(hand2, ncol = 2, byrow = TRUE)

## 1.5 Classes 
Changing the dimensions of an object will not change the type of the object, but it will change the object’s class.

In [None]:
dim(die) <- c(2,3)
typeof(die)                  #Every element in the matrix is still a double, but the elements have been arranged into a new structure
class(die)             #R gave die a new class when we changed its dimensions. This class describes die’s new format

In [None]:
attributes(die)    #the class of a matrix is implicit, so it does not appear when we run attributes; we look it up with class.

In [None]:
class("bye")   #we can also apply "class" to objects that do not have a class attribute. class returns a value based on the object’s atomic type.
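
The difference between what attributes stores and what class reports can be checked on a fresh matrix. This is a minimal sketch; the name `m` is made up for illustration.

```r
m <- matrix(1:6, nrow = 2)

names(attributes(m))     # only "dim" is stored on the object
class(m)                 # still reports "matrix" (plus "array" in R 4.0 and later)
inherits(m, "matrix")    # TRUE
class(1:3)               # "integer": with no dim, class falls back on the atomic type
```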

### 1.5.1 Date and time 
*class: "POSIXct" "POSIXt"*
- The time looks like a character string when we display it, but its data type is actually "double", and its class is "POSIXct" "POSIXt" (it has two classes).
- POSIXct is a widely used framework for representing dates and times.
- In the POSIXct framework, each time is represented by the number of seconds that have passed between the time and 12:00 AM January 1st 1970 (in the UTC time zone).

In [None]:
current <- Sys.time()               #displays the current time
current

In [None]:
typeof(current)     #data type is double
class(current)      #class is POSIXct & POSIXt

In [None]:
unclass(current)       #to view the number of seconds that have passed between 12 am Jan 1st 1970 and now
                      #i.e. the time above (current) occurred that many seconds after 12 am Jan 1st 1970

In [None]:
mil <- 1000000       #checking the time 1 million seconds after 12 am, jan 1st, 1970
mil
 
class(mil) <- c("POSIXct", "POSIXt")
mil                                        #it was January 12th, 1970 (the exact time shown depends on your time zone)
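
The displayed time depends on the computer's time zone. As a minimal sketch of a time-zone-independent version, `as.POSIXct` can build the same time with an explicit origin and time zone; the name `mil_utc` is made up for illustration.

```r
# One million seconds after 12 am Jan 1st 1970, displayed in UTC regardless of local settings
mil_utc <- as.POSIXct(1000000, origin = "1970-01-01", tz = "UTC")

format(mil_utc, "%Y-%m-%d %H:%M:%S")    # "1970-01-12 13:46:40"
as.numeric(mil_utc)                     # back to the underlying number of seconds
```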


There are many different classes of data in R and its packages, and you can wait to learn about a class until you encounter it.
However, one class should be learnt alongside the atomic data types: factors.

### 1.5.2 Factors
- Factors are used to store categorical information, like ethnicity, complexion, gender.

In [None]:
gender <- factor(c("male", "female", "female", "male"))   #To make a factor, pass an atomic vector into the factor function

In [None]:
typeof(gender)       #R recodes the data in the vector as integers and stores the results in an integer vector, so the data type is integer

- R adds a *levels* attribute to the integer vector, which contains a set of labels for displaying the factor values.
- R also adds a *class* attribute, which contains the class factor.

In [None]:
attributes(gender)

In [None]:
unclass(gender)      #to see how data is stored
#R uses the levels attribute when it displays the factor: each 1 is shown as female, the first label in the levels vector, and each 2 as male, the second label.

In [None]:
gender

Factors make it easy to put categorical variables into a statistical model because the variables are already coded as numbers. 
But factors can be confusing as they look like character strings but behave like integers.
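
A minimal sketch of this confusion, using a made-up factor `sizes` whose labels look like numbers: converting it straight to a number returns the integer codes, not the labels.

```r
sizes <- factor(c("10", "20", "10"))   # hypothetical factor whose labels look like numbers

as.integer(sizes)                  # 1 2 1: the integer codes, not the labels
as.numeric(as.character(sizes))    # 10 20 10: convert to text first, then to numbers
levels(sizes)                      # "10" "20"
```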

Versions of R before 4.0.0 converted character strings to factors by default when you loaded or created data; newer versions keep them as character strings. Either way, do not let R make factors until you ask for them.
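
One way to stay in control, sketched here with a made-up data frame `cards`, is to pass `stringsAsFactors = FALSE` to `data.frame` and then create a factor only where one is wanted. The `$` used to pick out a column is not covered in these notes yet.

```r
# stringsAsFactors = FALSE keeps text columns as character vectors
cards <- data.frame(face = c("ace", "king"), suit = c("hearts", "spades"),
                    stringsAsFactors = FALSE)
class(cards$face)      # "character"

# Ask for a factor explicitly only where you want one
cards$suit <- factor(cards$suit)
class(cards$suit)      # "factor"
```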

You can convert a factor to a character string with the as.character function. R will retain the display version of the factor, not the integers stored in memory:

In [None]:
as.character(gender)

In [None]:
#Make a virtual playing card by combining "ace", "hearts", and 1 into a vector
card <- c("ace", "hearts", 1)
card

- Each atomic vector can only store one type of data. As a result, R coerced all of the values in card to character strings.
- If you try to put multiple types of data into a vector, R will convert the elements to a single type of data.

## 1.6 Coercion
R follows specific rules while coercing data types: 
- If a character string is present in an atomic vector, R will convert everything else in the vector to character strings.
- If a vector only contains logicals and numbers, R will convert the logicals to numbers: TRUE becomes 1 and FALSE becomes 0.
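
Both rules can be checked with typeof. This is a small sketch with made-up values:

```r
typeof(c(1, TRUE, FALSE))     # "double": the logicals became numbers
c(1, TRUE, FALSE)             # 1 1 0

typeof(c("ace", 1, TRUE))     # "character": one string converts everything
c("ace", 1, TRUE)             # "ace" "1" "TRUE"
```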

In [261]:
#R uses the same coercion rules when we do math with logical values
sum (c(TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,TRUE))    #this becomes [sum(c(1,1,0,1,0,0,1))] 

In [267]:
mean (c(TRUE,TRUE,FALSE,FALSE))     #mean calculates the proportion of TRUE's

In [269]:
as.character(1)        #using "as" function to convert datatypes
as.logical(1)
as.numeric(FALSE)

In some cases, using a single type of data is a huge advantage. Vectors, matrices, and arrays make it easy to do math on large sets of numbers as each value can be manipulated in the same way. These operations tend to be fast because the objects are simple to store in memory.


## 1.7 Lists 
- Lists are like atomic vectors in that they group data into a one-dimensional set.
- However, lists do not group together individual values; they group together R objects, such as atomic vectors and other lists.
  
-> Making a list that contains a numeric vector of length 6 in its first element, a character vector of length 1 in its second element, and a new list of length 2 in its third element. To do this, use the list function.

In [282]:
list1 <- list(5:10, "numbers" , list (TRUE, FALSE))
list1

#### Using list to make a card

In [285]:
card <- list("ace", "hearts", 1)     #first and second elements are character vectors, third is a numeric vector
card

A list can be used to store a deck of playing cards: save each card as its own list, then save the deck as a list of 52 sublists (one for each card).
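
A minimal sketch of that idea, with only two cards instead of 52. The names `ace_hearts`, `king_spades`, and `mini_deck` are made up for illustration, and the `[[ ]]` and `$` used to reach into the list are not covered in these notes yet.

```r
# Each card is its own list; the deck is a list of those lists
ace_hearts  <- list(face = "ace",  suit = "hearts", value = 1)
king_spades <- list(face = "king", suit = "spades", value = 13)
mini_deck <- list(ace_hearts, king_spades)

length(mini_deck)               # 2 cards
mini_deck[[2]]$face             # "king"
typeof(mini_deck[[2]]$value)    # "double": the value stays numeric inside a list
```

Unlike the atomic vector version of card, nothing is coerced to a character string here.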

## 1.8 Data Frame
- Data frames are the two-dimensional version of a list, and they provide an ideal way to store a deck of cards.
- They group vectors together into a two-dimensional table. Each vector becomes a column in the table. Different columns can contain different types of data, but within a column every cell must contain the same type of data, and every column must be the same length.
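
Both properties can be checked on a small example. This is a sketch with made-up names (`hand`, `bad`); `tryCatch` is used only to capture the error so that the code keeps running.

```r
hand <- data.frame(face = c("king", "queen"), value = c(13, 12),
                   stringsAsFactors = FALSE)

sapply(hand, typeof)    # face is "character", value is "double"
nrow(hand)              # 2
ncol(hand)              # 2

# Columns of mismatched length (3 vs 2) are rejected with an error
bad <- tryCatch(data.frame(a = 1:3, b = c("x", "y")),
                error = function(e) "error")
bad                     # "error"
```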

Creating a data frame by hand requires laborious typing, but it can be done as follows:

In [292]:
df <- data.frame (face=c("king", "queen", "jack"), suit=c("heart", "heart" ,"heart"), value=c(13,12,11))
df                            #we can choose the argument names; each argument name becomes a column name in the data frame (face, suit, value)

face,suit,value
king,heart,13
queen,heart,12
jack,heart,11


In [298]:
typeof(df)               #its data type is a list
class(df)                #its class is data frame 
attributes(df)           #this will print the attributes 

In [300]:
unclass(df)     #to see how values are stored

To create a data frame for the whole deck of cards, we write 3 vectors with 52 elements each.

In [305]:
deck <- data.frame(
  face = c("king", "queen", "jack", "ten", "nine", "eight", "seven", "six",
    "five", "four", "three", "two", "ace", "king", "queen", "jack", "ten", 
    "nine", "eight", "seven", "six", "five", "four", "three", "two", "ace", 
    "king", "queen", "jack", "ten", "nine", "eight", "seven", "six", "five", 
    "four", "three", "two", "ace", "king", "queen", "jack", "ten", "nine", 
    "eight", "seven", "six", "five", "four", "three", "two", "ace"),  
  suit = c("spades", "spades", "spades", "spades", "spades", "spades", 
    "spades", "spades", "spades", "spades", "spades", "spades", "spades", 
    "clubs", "clubs", "clubs", "clubs", "clubs", "clubs", "clubs", "clubs", 
    "clubs", "clubs", "clubs", "clubs", "clubs", "diamonds", "diamonds", 
    "diamonds", "diamonds", "diamonds", "diamonds", "diamonds", "diamonds", 
    "diamonds", "diamonds", "diamonds", "diamonds", "diamonds", "hearts", 
    "hearts", "hearts", "hearts", "hearts", "hearts", "hearts", "hearts", 
    "hearts", "hearts", "hearts", "hearts", "hearts"), 
  value = c(13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 13, 12, 11, 10, 9, 8, 
    7, 6, 5, 4, 3, 2, 1, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 13, 12, 11, 
    10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
)

Always avoid writing large data sets by hand, if possible. The most convenient approach is to load data. 
Here, I only typed all of it to learn. 
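
When loading is not an option, the repetition in the deck can also be generated instead of typed. This is a sketch of one alternative, not the method used above: it relies on base R's `rep`, and the names `faces`, `suits`, and `deck2` are made up for illustration. Note that `13:1` creates integers, whereas the hand-typed values above are doubles.

```r
faces <- c("king", "queen", "jack", "ten", "nine", "eight", "seven",
           "six", "five", "four", "three", "two", "ace")
suits <- c("spades", "clubs", "diamonds", "hearts")

deck2 <- data.frame(
  face  = rep(faces, times = 4),     # the 13 faces repeated once per suit
  suit  = rep(suits, each = 13),     # each suit repeated for its 13 cards
  value = rep(13:1, times = 4),      # king = 13 down to ace = 1
  stringsAsFactors = FALSE
)

dim(deck2)        # 52 rows, 3 columns
head(deck2, 3)    # king, queen, jack of spades
```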

In [311]:
deck                            #the complete deck of cards

face,suit,value
king,spades,13
queen,spades,12
jack,spades,11
ten,spades,10
nine,spades,9
eight,spades,8
seven,spades,7
six,spades,6
five,spades,5
four,spades,4
