# R | DataFrames

DataFrames are a fundamental data structure in R that provides a tabular and spreadsheet-like representation of data. They are widely used for data manipulation, exploration, and analysis in R, and offer a powerful toolset for handling structured data effectively.

**Key Characteristics of DataFrames:**

1.  **Tabular Structure:** DataFrames are organized in a tabular format, similar to a spreadsheet, with rows representing individual observations or records, and columns representing different variables or attributes.
    
2.  **Heterogeneous Data:** DataFrames can hold data of various types within the same table. Each column in a DataFrame can have a different data type, making it ideal for handling mixed data, including numeric, character, logical, and factor variables.
    
3.  **Easy Data Import and Export:** R supports seamless importing of data from various sources, such as CSV files, Excel spreadsheets, databases, and APIs, into DataFrames. Likewise, exporting data from DataFrames to different file formats is straightforward.
    
4.  **Column Names and Row Indexing:** DataFrames have descriptive column names that facilitate easy referencing and intuitive manipulation of data. Additionally, you can set row names (row indexing) to provide meaningful labels for individual records.
    
5.  **Data Manipulation:** R offers an extensive set of functions and packages (e.g., dplyr, tidyr) designed explicitly for DataFrames, enabling users to efficiently perform filtering, grouping, reshaping, and other data manipulation tasks.
    
6.  **Integration with R Ecosystem:** DataFrames are seamlessly integrated with other R data structures, such as vectors and matrices. This integration enhances the compatibility and interoperability of DataFrames with existing R code.
    
7.  **Compatible with Statistical Models:** DataFrames are commonly used for statistical modeling and analysis in R. They provide a natural way to organize and analyze data for regression, classification, and other statistical modeling techniques.
    
8.  **Data Visualization:** R's visualization libraries, such as ggplot2, integrate smoothly with DataFrames, allowing for straightforward creation of insightful visualizations and graphs.
    

**Creating DataFrames:**

You can create a DataFrame from scratch using R's `data.frame()` function, which takes individual vectors as input and combines them into a single table. In addition, R's import functions, such as `read.csv()`, return the imported data as a DataFrame.

**Example:**

    # Creating a simple DataFrame manually
    name <- c("Alice", "Bob", "Charlie", "David")
    age <- c(25, 30, 22, 27)
    city <- c("New York", "San Francisco", "Chicago", "Los Angeles")

    # Combining vectors into a DataFrame
    df <- data.frame(Name = name, Age = age, City = city)

    # Displaying the DataFrame
    print(df)
    

DataFrames play a central role in R data analysis workflows, and their versatility makes them a preferred choice for managing structured data and conducting sophisticated data analyses.
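To make the "heterogeneous data" point concrete, the sketch below rebuilds the small table from the example and inspects the type of each column. Passing `stringsAsFactors = FALSE` is an assumption made here so the result is the same on older and newer R versions.

```r
# Rebuild the small example table. stringsAsFactors = FALSE is passed
# explicitly so character columns stay character on any R version.
df <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Age = c(25, 30, 22, 27),
  City = c("New York", "San Francisco", "Chicago", "Los Angeles"),
  stringsAsFactors = FALSE
)

# Each column is a vector with its own type
col_classes <- sapply(df, class)
print(col_classes)

# Under the hood, a data frame is a list of equal-length columns
is.list(df)
```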

# Creating Data Frames

You can create a new data frame by passing vectors of the same length to the data.frame() function. The vectors you pass in become the columns of the data frame. The data you pass in can be named or unnamed.

In [1]:
# Create some vectors
a <- c(1, 2, 3, 4, 5)
b <- c("R", "Is", "Fun!", "Let's", "Learn")
c <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

# Create a new data frame
my_frame <- data.frame(a,b,c)

my_frame

a,b,c
<dbl>,<chr>,<lgl>
1,R,TRUE
2,Is,FALSE
3,Fun!,TRUE
4,Let's,TRUE
5,Learn,FALSE


Since we did not supply column names, the columns took the names of the variables used to create the data frame. We could have assigned column names when creating the data frame.

In [2]:
my_frame <- data.frame(numeric = a, 
                       character = b, 
                       logical = c)

my_frame

numeric,character,logical
<dbl>,<chr>,<lgl>
1,R,TRUE
2,Is,FALSE
3,Fun!,TRUE
4,Let's,TRUE
5,Learn,FALSE


You can check and reassign column names using the colnames() function.

In [3]:
colnames(my_frame)

colnames(my_frame) <- c("c1","c2","c3")

colnames(my_frame)
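Reassigning every name at once is not always what you want. As a small sketch, the example below renames individual columns by position and by current name; `demo_frame` and the new names `text` and `flag` are made up for illustration.

```r
# A separate, hypothetical frame so the tutorial's my_frame is left alone
demo_frame <- data.frame(numeric = c(1, 2, 3),
                         character = c("R", "Is", "Fun!"),
                         logical = c(TRUE, FALSE, TRUE))

# Rename only the second column, selected by position
colnames(demo_frame)[2] <- "text"

# Rename a column by matching its current name
colnames(demo_frame)[colnames(demo_frame) == "logical"] <- "flag"

colnames(demo_frame)
```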

Data frames also support named rows. You can create row names when creating a data frame by including the row.names argument and setting it equal to a character vector to be used for row names.

In [4]:
my_frame <- data.frame(numeric = a, character = b, logical = c,
                      row.names = c("r1","r2","r3","r4","r5"))

my_frame

,numeric,character,logical
,<dbl>,<chr>,<lgl>
r1,1,R,TRUE
r2,2,Is,FALSE
r3,3,Fun!,TRUE
r4,4,Let's,TRUE
r5,5,Learn,FALSE


You can check and alter row names after creating a data frame using the rownames() function.

In [5]:
rownames(my_frame)

rownames(my_frame) <- 1:5

rownames(my_frame)

It is worth mentioning that some common data structures do not support row names. One example is the tibble, which is used throughout the popular tidyverse collection of R packages (we will delve into this in future lessons). The rationale behind this limitation is explained in further detail [here](https://adv-r.hadley.nz/vectors-chap.html#rownames).

In real-world scenarios, the majority of the data frames you will work with are likely to come from external sources rather than being manually created. When importing data into R for analysis from tabular sources like Excel files or CSV files, it is typically structured as a data frame.
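As a rough illustration of that import path, the sketch below writes a tiny table to a temporary CSV file and reads it back with base R's `read.csv()`. The `people` data and the temporary file are made up for this example; with real data you would point `read.csv()` at your own file.

```r
# Temporary file used only for this sketch
csv_path <- tempfile(fileext = ".csv")

# Made-up data to export
people <- data.frame(name = c("Alice", "Bob"),
                     age = c(25, 30))

# Export without the automatic row-name column
write.csv(people, csv_path, row.names = FALSE)

# Import: read.csv() returns a data frame
imported <- read.csv(csv_path)

class(imported)
str(imported)

# Clean up the temporary file
unlink(csv_path)
```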

In [6]:
cars <- mtcars   # Load the mtcars data 

print(cars)

                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0   

# Summarizing Data Frames

When loading new data into R, it is essential to initially explore the dataset to understand its variables and values before delving into more in-depth analysis. Real-world data often comes with imperfections, such as oddly formatted values and missing (NA) values. Preparing the data for analysis, known as data munging or data wrangling, can be a time-consuming process. Data summaries play a crucial role in identifying areas that require cleaning.
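One quick check for the missing values mentioned above is to count the `NA` entries in each column. The sketch below uses a small made-up `survey` table (the built-in `mtcars` data has no missing values), so the data is purely illustrative.

```r
# Small made-up table with a few missing values (illustrative data only)
survey <- data.frame(
  id = 1:5,
  score = c(90, NA, 75, NA, 60),
  group = c("a", "b", NA, "b", "a")
)

# Number of missing values in each column
na_counts <- colSums(is.na(survey))
print(na_counts)

# Keep only the rows that have no missing values at all
complete_rows <- survey[complete.cases(survey), ]
print(complete_rows)
```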

Data frames support many of the summary functions that also work on matrices and lists. One such function is `str()`, which provides an overview of the data frame's structure. Checking the structure first is beneficial, especially when dealing with large datasets, as running a full summary can be time-consuming.

Instead of diving directly into detailed analysis, examining the structure of the data frame using `str()` allows you to gain insights into the organization of the data and helps determine the extent of cleaning or preprocessing required. Once you have a clear understanding of the data's structure, you can efficiently proceed with data cleaning and subsequent analysis.

In [7]:
str(cars)

'data.frame':	32 obs. of  11 variables:
 $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
 $ disp: num  160 160 108 258 360 ...
 $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num  16.5 17 18.6 19.4 17 ...
 $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
 $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num  4 4 1 1 2 1 4 2 2 4 ...


The summary() function gives summary statistics for each variable in the data frame.

In [8]:
summary(cars)

      mpg             cyl             disp             hp       
 Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
 1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
 Median :19.20   Median :6.000   Median :196.3   Median :123.0  
 Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
 3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
 Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
      drat             wt             qsec             vs        
 Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
 1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
 Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
 Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
 3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
 Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
       am              gear            carb      
 Min.   :0.0000   Min.   :3.000  

If a data frame is large, you won't want to try to print the entire frame to the screen. You can look at a few rows at the beginning or end of a data frame using the head() and tail() functions respectively:

In [9]:
head(cars, 5)     # Look at the first 5 rows of the data frame

tail(cars, 5)     # Look at the last 5 rows of the data frame

,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
Mazda RX4,21.0,6,160,110,3.9,2.62,16.46,0,1,4,4
Mazda RX4 Wag,21.0,6,160,110,3.9,2.875,17.02,0,1,4,4
Datsun 710,22.8,4,108,93,3.85,2.32,18.61,1,1,4,1
Hornet 4 Drive,21.4,6,258,110,3.08,3.215,19.44,1,0,3,1
Hornet Sportabout,18.7,8,360,175,3.15,3.44,17.02,0,0,3,2


,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
Lotus Europa,30.4,4,95.1,113,3.77,1.513,16.9,1,1,5,2
Ford Pantera L,15.8,8,351.0,264,4.22,3.17,14.5,0,1,5,4
Ferrari Dino,19.7,6,145.0,175,3.62,2.77,15.5,0,1,5,6
Maserati Bora,15.0,8,301.0,335,3.54,3.57,14.6,0,1,5,8
Volvo 142E,21.4,4,121.0,109,4.11,2.78,18.6,1,1,4,2


Data frames support a few other basic summary operations.

In [10]:
dim(cars)      # Get the dimensions of the data frame

nrow(cars)     # Get the number of rows

ncol(cars)     # Get the number of columns
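For reference, here is a brief sketch of how these three functions relate, run against the built-in `mtcars` data directly. The counts match the `str()` output shown earlier (32 observations of 11 variables).

```r
# dim() reports rows first, then columns
dims <- dim(mtcars)
print(dims)

# nrow() and ncol() return the same two counts individually
n_rows <- nrow(mtcars)
n_cols <- ncol(mtcars)

# dim() is simply the two counts combined
identical(dims, c(n_rows, n_cols))
```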

# Data Frame Indexing

Data frame indexing in R refers to the process of accessing specific rows or columns within a data frame. Data frames are tabular data structures in R, and indexing allows you to extract, modify, or analyze subsets of the data based on row and column specifications.

**Data Frame Indexing Methods:**

1.  **Using Row and Column Numbers:**
    
    -   To access specific elements, you can use row and column numbers inside square brackets `[]`.
    -   For example, `df[2, 3]` will retrieve the value at the second row and third column of the data frame `df`.
2.  **Using Row and Column Names:**
    
    -   You can also use row and column names to index elements within the data frame.
    -   For example, `df["Alice", "Age"]` will retrieve the value in the "Age" column for the row with the name "Alice" in the data frame `df`.
3.  **Using Logical Indexing:**
    
    -   Logical indexing involves using logical conditions to select specific rows or columns that meet specific criteria.
    -   For example, `df[df$Age > 25, ]` will return all rows where the "Age" column has values greater than 25.
4.  **Selecting Columns:**
    
    -   You can extract specific columns by using the `$` symbol or by using the column index.
    -   For example, `df$Name` will retrieve the "Name" column, and `df[, 2]` will return the second column of the data frame `df`.
5.  **Subset by Row and Column:**
    
    -   To select both rows and columns simultaneously, you can use a combination of row and column indexing.
    -   For example, `df[1:3, c("Name", "Age")]` will retrieve the "Name" and "Age" columns for the first three rows in the data frame `df`.
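The five methods above can be tried on the small `df` table from the introduction. One assumption is added in this sketch: row names are set from the `Name` column, because looking up a row by name such as `"Alice"` only works when row names exist.

```r
df <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Age = c(25, 30, 22, 27),
  City = c("New York", "San Francisco", "Chicago", "Los Angeles"),
  stringsAsFactors = FALSE
)

# Assumption for this sketch: use the names as row labels
rownames(df) <- df$Name

by_position <- df[2, 3]                    # 1. row 2, column 3
by_name     <- df["Alice", "Age"]          # 2. row "Alice", column "Age"
older       <- df[df$Age > 25, ]           # 3. rows where Age exceeds 25
name_col    <- df$Name                     # 4. one column as a vector
block       <- df[1:3, c("Name", "Age")]   # 5. rows 1 to 3, two columns

print(by_position)
print(by_name)
print(older)
print(block)
```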

In [11]:
head( mtcars[6]  )  # Single brackets take column slices 

class( mtcars[6] )  # And return a new data frame

,wt
,<dbl>
Mazda RX4,2.62
Mazda RX4 Wag,2.875
Datsun 710,2.32
Hornet 4 Drive,3.215
Hornet Sportabout,3.44
Valiant,3.46


In [12]:
# Double brackets get the actual object at the index
head( mtcars[[6]]  )

typeof( mtcars[[6]]  )

In [13]:
# Column name notation in double brackets works
head( mtcars[["wt"]]  )  

# As does the $ notation
head( mtcars$wt  )       

Data frames also support matrix-like indexing by using a single square bracket with a comma separating the index values for the row and column. Matrix indexing allows you to get entire rows, entire columns, or specific values within the data frame.

In [14]:
# Get the value at row 2 column 6
cars[2,6]   

# Get the second row
cars[2, ]  

# Get the 6th column
cars[ ,6]  

# Get a row by using its name
cars["Mazda RX4", ]

# Get a column by using its name
cars[ ,"mpg"]   

,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
Mazda RX4 Wag,21,6,160,110,3.9,2.875,17.02,0,1,4,4


,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
Mazda RX4,21,6,160,110,3.9,2.62,16.46,0,1,4,4
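Only the row selections above print as tables because selecting a single column with matrix-style indexing collapses the result to a plain vector. The sketch below shows that behavior and the `drop = FALSE` argument that keeps the result as a data frame; the variable names are illustrative.

```r
# Selecting one column with matrix-style indexing collapses to a vector
wt_vector <- mtcars[, "wt"]

# drop = FALSE keeps the result as a one-column data frame
wt_frame <- mtcars[, "wt", drop = FALSE]

class(wt_vector)
class(wt_frame)

# Double brackets and $ return the same underlying column
identical(mtcars[["wt"]], mtcars$wt)
```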


All of the indexing methods shown in previous lessons still apply, even logical indexing.

In [15]:
# Get rows where mpg is greater than 25
cars[(cars$mpg > 25), ]   

,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
Fiat 128,32.4,4,78.7,66,4.08,2.2,19.47,1,1,4,1
Honda Civic,30.4,4,75.7,52,4.93,1.615,18.52,1,1,4,2
Toyota Corolla,33.9,4,71.1,65,4.22,1.835,19.9,1,1,4,1
Fiat X1-9,27.3,4,79.0,66,4.08,1.935,18.9,1,1,4,1
Porsche 914-2,26.0,4,120.3,91,4.43,2.14,16.7,0,1,5,2
Lotus Europa,30.4,4,95.1,113,3.77,1.513,16.9,1,1,5,2


Instead of logical indexing, you can also use the subset() function to create data frame subsets based on logical statements. subset() takes the data frame as the first argument and a logical statement as the second argument to create a subset.

In [16]:
# Subset with over 20 mpg and 70 horsepower
subset(cars, (mpg > 20) & (hp > 70))   

,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
Mazda RX4,21.0,6,160.0,110,3.9,2.62,16.46,0,1,4,4
Mazda RX4 Wag,21.0,6,160.0,110,3.9,2.875,17.02,0,1,4,4
Datsun 710,22.8,4,108.0,93,3.85,2.32,18.61,1,1,4,1
Hornet 4 Drive,21.4,6,258.0,110,3.08,3.215,19.44,1,0,3,1
Merc 230,22.8,4,140.8,95,3.92,3.15,22.9,1,0,4,2
Toyota Corona,21.5,4,120.1,97,3.7,2.465,20.01,1,0,3,1
Porsche 914-2,26.0,4,120.3,91,4.43,2.14,16.7,0,1,5,2
Lotus Europa,30.4,4,95.1,113,3.77,1.513,16.9,1,1,5,2
Volvo 142E,21.4,4,121.0,109,4.11,2.78,18.6,1,1,4,2
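Bracket indexing and subset() differ in one respect that matters for messy data: how they treat rows where the condition evaluates to `NA`. The sketch below uses a made-up `readings` table to show the difference, along with subset()'s `select` argument for choosing columns.

```r
# Made-up readings with one missing value (illustrative data only)
readings <- data.frame(sensor = c("a", "b", "c", "d"),
                       value = c(12, NA, 30, 8))

# Logical indexing returns an all-NA row wherever the condition is NA
by_bracket <- readings[readings$value > 10, ]

# subset() drops rows whose condition is NA; select picks the columns to keep
by_subset <- subset(readings, value > 10, select = c(sensor, value))

nrow(by_bracket)
nrow(by_subset)
```

With bracket indexing, wrapping the condition in `which()` is one way to drop the `NA` rows as well.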


The matrix functions cbind() and rbind() we covered in part 6 work on data frames, providing an easy way to combine two data frames with the same number of rows or columns.
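A minimal sketch of both functions on data frames follows; the small tables and their column names are made up for illustration.

```r
# Two made-up tables with the same columns
first_half  <- data.frame(id = 1:2, score = c(90, 85))
second_half <- data.frame(id = 3:4, score = c(70, 95))

# rbind() stacks rows; the column names must match
all_rows <- rbind(first_half, second_half)

# cbind() adds columns; the row counts must match
extra <- data.frame(passed = c(TRUE, TRUE, FALSE, TRUE))
combined <- cbind(all_rows, extra)

print(combined)
dim(combined)
```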
You can also delete columns in a data frame by assigning them a value of NULL.

In [17]:
cars$vs <- NULL     # Drop the column "vs"

cars$carb <- NULL   # Drop the column "carb"

head(cars)

,mpg,cyl,disp,hp,drat,wt,qsec,am,gear
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
Mazda RX4,21.0,6,160,110,3.9,2.62,16.46,1,4
Mazda RX4 Wag,21.0,6,160,110,3.9,2.875,17.02,1,4
Datsun 710,22.8,4,108,93,3.85,2.32,18.61,1,4
Hornet 4 Drive,21.4,6,258,110,3.08,3.215,19.44,0,3
Hornet Sportabout,18.7,8,360,175,3.15,3.44,17.02,0,3
Valiant,18.1,6,225,105,2.76,3.46,20.22,0,3


You cannot drop rows by assigning them a value of NULL due to the way data frames are stored as lists of columns. If you want to drop rows, you can use matrix-style subsetting with the `-` operator.

In [18]:
cars <- cars[-c(1, 3), ] # Drop rows 1 and 3

head( cars )             # Note Mazda RX4 and Datsun 710 have been removed

,mpg,cyl,disp,hp,drat,wt,qsec,am,gear
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
Mazda RX4 Wag,21.0,6,160.0,110,3.9,2.875,17.02,1,4
Hornet 4 Drive,21.4,6,258.0,110,3.08,3.215,19.44,0,3
Hornet Sportabout,18.7,8,360.0,175,3.15,3.44,17.02,0,3
Valiant,18.1,6,225.0,105,2.76,3.46,20.22,0,3
Duster 360,14.3,8,360.0,245,3.21,3.57,15.84,0,3
Merc 240D,24.4,4,146.7,62,3.69,3.19,20.0,0,4
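Negative positions are fragile when the row order can change. As an alternative sketch, rows can also be dropped by name or by condition, in both cases by keeping everything that does not match. The name `cars_copy` is used here so the tutorial's `cars` object is left untouched.

```r
# Work on a fresh copy of the built-in data
cars_copy <- mtcars

# Drop rows by name: keep every row whose name is not in the list
to_drop <- c("Mazda RX4", "Datsun 710")
by_name <- cars_copy[!(rownames(cars_copy) %in% to_drop), ]

# Drop rows by condition: keep only cars that do not have 8 cylinders
by_condition <- cars_copy[cars_copy$cyl != 8, ]

nrow(by_name)
nrow(by_condition)
```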


# Wrap Up

Data frames are one of the main reasons R is a good tool for working with data. Data in many common formats translate directly into R data frames and they are easy to summarize and subset.

# Exercises

To do the exercises, fill in and run the code boxes according to the exercise instructions.

It is common knowledge that cars with 4 cylinders tend to be less powerful but get better gas mileage than cars with 6 or 8 cylinders, but how much of an impact do additional cylinders have on horsepower? Are 8 cylinder cars twice as powerful as 4 cylinder cars? In these exercises we'll investigate this question.

### Exercise #1
Add a new column to cars called hp_per_cylinder equal to the hp column divided by the cyl column.

*Note: Recall that we can add new objects to a list with `list$new_column_name <- column_values`. This syntax also works for data frames, since data frames are also lists.*

In [21]:
cars <- mtcars

"Your code here"

summary(cars)

      mpg             cyl             disp             hp       
 Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
 1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
 Median :19.20   Median :6.000   Median :196.3   Median :123.0  
 Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
 3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
 Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
      drat             wt             qsec             vs        
 Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
 1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
 Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
 Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
 3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
 Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
       am              gear            carb      
 Min.   :0.0000   Min.   :3.000  

### Exercise #2
Create a new data frame called "high_power_cars" that is a subset of cars where hp_per_cylinder is greater than 30.

In [22]:
high_power_cars <- "Your Code here!"
high_power_cars

### Exercise #3
Notice that all the cars with high hp per cylinder have 8 cylinders. Repeat the previous exercise, but this time make a subset of low power cars with an hp_per_cylinder less than 17.

In [23]:
low_power_cars <- "Your Code here!"
low_power_cars

The results of the previous two exercises reveal that the cars with the highest power per cylinder all have 8 cylinders, while the cars with the lowest power per cylinder all have 4 cylinders. This suggests that 8 cylinder cars may not have higher horsepower simply because they have more cylinders; perhaps 8 cylinder engines are also designed to produce more power per cylinder.

## Exercise Solutions

In [24]:
# 1 
cars <- mtcars

cars$hp_per_cylinder <- cars$hp / cars$cyl

summary(cars)

# 2 

high_power_cars <- cars[cars$hp_per_cylinder > 30, ]
high_power_cars

# 3

low_power_cars <- cars[cars$hp_per_cylinder < 17, ]
low_power_cars

      mpg             cyl             disp             hp       
 Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
 1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
 Median :19.20   Median :6.000   Median :196.3   Median :123.0  
 Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
 3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
 Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
      drat             wt             qsec             vs        
 Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
 1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
 Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
 Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
 3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
 Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
       am              gear            carb       hp_per_cylinder
 Min.   :0.0000  

,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb,hp_per_cylinder
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
Duster 360,14.3,8,360,245,3.21,3.57,15.84,0,0,3,4,30.625
Camaro Z28,13.3,8,350,245,3.73,3.84,15.41,0,0,3,4,30.625
Ford Pantera L,15.8,8,351,264,4.22,3.17,14.5,0,1,5,4,33.0
Maserati Bora,15.0,8,301,335,3.54,3.57,14.6,0,1,5,8,41.875


,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb,hp_per_cylinder
,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
Merc 240D,24.4,4,146.7,62,3.69,3.19,20.0,1,0,4,2,15.5
Fiat 128,32.4,4,78.7,66,4.08,2.2,19.47,1,1,4,1,16.5
Honda Civic,30.4,4,75.7,52,4.93,1.615,18.52,1,1,4,2,13.0
Toyota Corolla,33.9,4,71.1,65,4.22,1.835,19.9,1,1,4,1,16.25
Fiat X1-9,27.3,4,79.0,66,4.08,1.935,18.9,1,1,4,1,16.5
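The two subsets only look at the extremes. As an optional follow-up sketch, not part of the original exercises, the average horsepower per cylinder can be computed for every cylinder count with base R's `tapply()`; the variable names here are illustrative.

```r
cars <- mtcars

# Horsepower per cylinder for every car
hp_per_cyl <- cars$hp / cars$cyl

# Mean horsepower per cylinder within each cylinder count (4, 6, 8)
avg_by_cyl <- tapply(hp_per_cyl, cars$cyl, mean)
print(round(avg_by_cyl, 2))
```

Comparing the three averages shows whether the pattern seen in the extremes also holds for the groups as a whole.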
