## DataFrame.info

`DataFrame.info(verbose=None, buf=None, max_cols=None, memory_usage=None, show_counts=None)`

Print a concise summary of a DataFrame.

This method prints information about a DataFrame including the index dtype and columns, non-null values, and memory usage.

### Parameters:
- **verbose**: `bool`, optional  
  Whether to print the full summary. By default, the setting in `pandas.options.display.max_info_columns` is followed.

- **buf**: writable buffer, defaults to `sys.stdout`  
  Where to send the output. By default, the output is printed to `sys.stdout`. Pass a writable buffer if you need to further process the output.

- **max_cols**: `int`, optional  
  When to switch from the verbose to the truncated output. If the DataFrame has more than `max_cols` columns, the truncated output is used. By default, the setting in `pandas.options.display.max_info_columns` is used.

- **memory_usage**: `bool`, `str`, optional  
  Specifies whether total memory usage of the DataFrame elements (including the index) should be displayed. By default, this follows the `pandas.options.display.memory_usage` setting.
  
  - `True` always shows memory usage.
  - `False` never shows memory usage.
  - `'deep'` is equivalent to `True` with deep introspection.

  Memory usage is shown in human-readable units (base-2 representation). Without deep introspection, memory is estimated from each column's dtype and the number of rows, assuming every value of a given dtype consumes the same amount of memory. A trailing `+` on the reported figure (as in `228.0+ bytes`) marks such an estimate as a lower bound. With deep introspection, actual memory usage is calculated, at the cost of extra computation. See the Frequently Asked Questions for more details.

- **show_counts**: `bool`, optional  
  Whether to show the non-null counts. By default, this is shown only if the DataFrame is smaller than `pandas.options.display.max_info_rows` and `pandas.options.display.max_info_columns`. A value of `True` always shows the counts, and `False` never shows the counts.
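
The `buf` parameter is easiest to understand with a small capture example. The sketch below sends the summary to an `io.StringIO` buffer instead of the console; the two-row DataFrame and the variable names are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [24, 27]})

buffer = io.StringIO()
df.info(buf=buffer)           # nothing is printed; the summary goes to the buffer
summary = buffer.getvalue()   # the full summary as a single string

# The captured text can now be searched, logged, or written to a file
for line in summary.splitlines():
    print("captured:", line)
```

Note that `info()` itself returns `None`, so a buffer is the way to get at the text programmatically.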

In [3]:
import pandas as pd

# Creating a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'Occupation': ['Engineer', 'Doctor', None, 'Artist']
}

df = pd.DataFrame(data)

# Using the DataFrame.info method
print("Default info:")
df.info()



Default info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   Name        4 non-null      object
 1   Age         4 non-null      int64 
 2   Occupation  3 non-null      object
dtypes: int64(1), object(2)
memory usage: 228.0+ bytes


In [4]:
print("\nVerbose info:")
df.info(verbose=True)




Verbose info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   Name        4 non-null      object
 1   Age         4 non-null      int64 
 2   Occupation  3 non-null      object
dtypes: int64(1), object(2)
memory usage: 228.0+ bytes
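
With only three columns, `verbose=True` prints the same thing as the default. To see the truncated form, a sketch like the following can help; it assumes a made-up three-column DataFrame and a hypothetical helper, `info_text`, that captures the output through `buf`.

```python
import io

import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Age": [24, 27],
    "City": ["Oslo", "Lima"],
})


def info_text(frame, **kwargs):
    """Return the text that frame.info(**kwargs) would print (helper for this sketch)."""
    buffer = io.StringIO()
    frame.info(buf=buffer, **kwargs)
    return buffer.getvalue()


full = info_text(df, verbose=True)     # per-column table
short = info_text(df, verbose=False)   # truncated summary, no per-column table
capped = info_text(df, max_cols=2)     # 3 columns > max_cols, so also truncated

print(short)
```

The truncated output drops the per-column table and reports only the column range, dtypes, and memory usage.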


In [5]:
print("\nInfo with memory usage:")
df.info(memory_usage='deep')



Info with memory usage:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   Name        4 non-null      object
 1   Age         4 non-null      int64 
 2   Occupation  3 non-null      object
dtypes: int64(1), object(2)
memory usage: 571.0 bytes
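
The gap between the shallow estimate and the deep figure can be inspected column by column with `DataFrame.memory_usage`. This is a minimal sketch on made-up data; it assumes the strings are stored with `object` dtype, as in the outputs above, and exact byte counts will vary by platform and pandas version.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [24, 27, 22, 32],
})

# Shallow: estimated from dtype and row count only
shallow = df.memory_usage(deep=False)
# Deep: also inspects the contents of non-numeric columns
deep = df.memory_usage(deep=True)

print(shallow)
print(deep)
print("total shallow:", shallow.sum(), "total deep:", deep.sum())
```

Fixed-width columns such as `Age` report the same size either way; the difference comes from columns whose values vary in size.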


In [6]:

print("\nInfo with non-null counts always shown:")
df.info(show_counts=True)



Info with non-null counts always shown:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   Name        4 non-null      object
 1   Age         4 non-null      int64 
 2   Occupation  3 non-null      object
dtypes: int64(1), object(2)
memory usage: 228.0+ bytes


## DataFrame.describe

`DataFrame.describe(percentiles=None, include=None, exclude=None)`

Generate descriptive statistics.

Descriptive statistics include those that summarize the central tendency, dispersion, and shape of a dataset’s distribution, excluding NaN values.

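Analyzes both numeric and object Series, as well as DataFrame column sets of mixed data types. The output varies with the input: numeric data yields `count`, `mean`, `std`, `min`, `max`, and the requested percentiles, while object data (such as strings) yields `count`, `unique`, `top`, and `freq`.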

### Parameters:

- **percentiles**: `list-like of numbers`, optional  
  The percentiles to include in the output. All should fall between 0 and 1. The default is `[.25, .5, .75]`, which returns the 25th, 50th, and 75th percentiles.

- **include**: `'all'`, `list-like of dtypes` or `None` (default), optional  
  A whitelist of data types to include in the result. Ignored for Series. Here are the options:
  - `'all'`: All columns of the input will be included in the output.
  - A `list-like of dtypes`: Limits the results to the provided data types. To limit the result to numeric types, submit `numpy.number`. To limit it instead to object columns, submit the data type `numpy.object_` (the old `numpy.object` alias has been removed from recent NumPy releases). Strings can also be used in the style of `select_dtypes` (e.g. `df.describe(include=['O'])`). To select pandas categorical columns, use `'category'`.
  - `None` (default): The result will include all numeric columns.

- **exclude**: `list-like of dtypes` or `None` (default), optional  
  A blacklist of data types to omit from the result. Ignored for Series. Here are the options:
  - A `list-like of dtypes`: Excludes the provided data types from the result. To exclude numeric types, submit `numpy.number`. To exclude object columns, submit the data type `numpy.object_`. Strings can also be used in the style of `select_dtypes` (e.g. `df.describe(exclude=['O'])`). To exclude pandas categorical columns, use `'category'`.
  - `None` (default): The result will exclude nothing.
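
The string aliases mentioned above can be sketched as follows. The DataFrame is made up for illustration, and `'number'` and `'category'` are used here as `select_dtypes`-style strings.

```python
import pandas as pd

# Hypothetical sample data: one numeric column, one categorical column
df = pd.DataFrame({
    "Age": [24, 27, 22, 32],
    "Occupation": pd.Categorical(["Engineer", "Doctor", "Artist", "Artist"]),
})

numeric_only = df.describe(include=["number"])        # numeric columns only
categorical_only = df.describe(include=["category"])  # categorical columns only
no_numeric = df.describe(exclude=["number"])          # everything except numeric

print(numeric_only)
print(categorical_only)
print(no_numeric)
```

For this frame, `include=['number']` selects the same columns as the default call, and the categorical summary reports `count`, `unique`, `top`, and `freq` rather than means and percentiles.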

In [7]:
import pandas as pd
import numpy as np

# Creating a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'Salary': [70000, 55000, 42000, 60000],
    'Occupation': ['Engineer', 'Doctor', 'Artist', 'Artist']
}

df = pd.DataFrame(data)

# Using the DataFrame.describe method with default settings
print("Default describe:")
print(df.describe())



Default describe:
             Age       Salary
count   4.000000      4.00000
mean   26.250000  56750.00000
std     4.349329  11644.02565
min    22.000000  42000.00000
25%    23.500000  51750.00000
50%    25.500000  57500.00000
75%    28.250000  62500.00000
max    32.000000  70000.00000


In [8]:

# Including non-numeric columns in the describe method
print("\nDescribe including all data types:")
print(df.describe(include='all'))




Describe including all data types:
         Name        Age       Salary Occupation
count       4   4.000000      4.00000          4
unique      4        NaN          NaN          3
top     Alice        NaN          NaN     Artist
freq        1        NaN          NaN          2
mean      NaN  26.250000  56750.00000        NaN
std       NaN   4.349329  11644.02565        NaN
min       NaN  22.000000  42000.00000        NaN
25%       NaN  23.500000  51750.00000        NaN
50%       NaN  25.500000  57500.00000        NaN
75%       NaN  28.250000  62500.00000        NaN
max       NaN  32.000000  70000.00000        NaN


In [9]:
# Specifying percentiles
print("\nDescribe with custom percentiles (10%, 50%, 90%):")
print(df.describe(percentiles=[.1, .5, .9]))




Describe with custom percentiles (10%, 50%, 90%):
             Age       Salary
count   4.000000      4.00000
mean   26.250000  56750.00000
std     4.349329  11644.02565
min    22.000000  42000.00000
10%    22.600000  45900.00000
50%    25.500000  57500.00000
90%    30.500000  67000.00000
max    32.000000  70000.00000
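
The `10%` value of 22.6 for `Age` is not one of the four observations. It can be reproduced by hand, assuming the default linear interpolation of `Series.quantile` between neighbouring sorted values. The sketch below is a worked check, not a description of pandas internals.

```python
import pandas as pd

ages = sorted([24, 27, 22, 32])   # [22, 24, 27, 32]

q = 0.1
position = q * (len(ages) - 1)    # fractional index into the sorted values
lower = int(position)             # index of the value just below the position
fraction = position - lower       # how far to move toward the next value

# Linear interpolation between the two neighbouring observations
manual = ages[lower] + fraction * (ages[lower + 1] - ages[lower])
from_pandas = pd.Series(ages).quantile(q)

print(manual, from_pandas)
```

The same arithmetic gives 30.5 for the 90th percentile: the position is 2.7, so the result lies 70% of the way from 27 to 32.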


In [10]:
# Excluding numeric columns from the describe method
print("\nDescribe excluding numeric columns:")
print(df.describe(exclude=[np.number]))



Describe excluding numeric columns:
         Name Occupation
count       4          4
unique      4          3
top     Alice     Artist
freq        1          2


In [12]:

# Including only object columns (such as strings)
print("\nDescribe including only object columns:")
print(df.describe(include=[np.object_]))


Describe including only object columns:
         Name Occupation
count       4          4
unique      4          3
top     Alice     Artist
freq        1          2
