# Visualization Curriculum

## Chapter 4: Scales, Axes, and Legends

---
* Author:  [Yuttapong Mahasittiwat](mailto:khala1391@gmail.com)
* Technologist | Data Modeler | Data Analyst
* [YouTube](https://www.youtube.com/khala1391)
* [LinkedIn](https://www.linkedin.com/in/yuttapong-m/)
---

Source: [Visualization Curriculum](https://idl.uw.edu/visualization-curriculum/altair_introduction.html)

In [3]:
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import altair as alt
print("pandas version :", pd.__version__)
print("numpy version :", np.__version__)
print("matplotlib version :", mpl.__version__)
print("seaborn version :", sns.__version__)
print("altair version :", alt.__version__)

pandas version : 2.2.1
numpy version : 1.26.4
matplotlib version : 3.8.4
seaborn version : 0.13.2
altair version : 5.4.0


In [4]:
import warnings
# Silence the pandas FutureWarning about the deprecated convert_dtype parameter
warnings.filterwarnings('ignore', category=FutureWarning, message="the convert_dtype parameter is deprecated")

In [5]:
# Burtin's antibiotics dataset from vega-datasets
antibiotics = 'https://cdn.jsdelivr.net/npm/vega-datasets@1/data/burtin.json'
df = pd.read_json(antibiotics)  # local copy, used only for inspection below

In [6]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16 entries, 0 to 15
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Bacteria       16 non-null     object 
 1   Penicillin     16 non-null     float64
 2   Streptomycin   16 non-null     float64
 3   Neomycin       16 non-null     float64
 4   Gram_Staining  16 non-null     object 
 5   Genus          16 non-null     object 
dtypes: float64(3), object(3)
memory usage: 900.0+ bytes


In [7]:
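# Default linear scale: the largest values stretch the axis, so most points tend to crowd near zero
alt.Chart(antibiotics).mark_circle().encode(
    alt.X('Neomycin:Q')
)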

In [13]:
# Square-root scale: gives the small values somewhat more room than the linear scale
alt.Chart(antibiotics).mark_circle().encode(
    alt.X('Neomycin:Q',
          scale=alt.Scale(type='sqrt'))
)

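Common values for the `type` of `alt.Scale`:

- `linear`: the default for quantitative fields; position is proportional to the value.
- `log`: logarithmic transform for positive data that spans several orders of magnitude (the domain cannot include zero).
- `sqrt`: square-root transform; compresses large values less strongly than `log`.
- `ordinal`: maps discrete values to a discrete range, such as colors or shapes.
- `point`: places categorical values at evenly spaced points along a positional axis.
- `band`: divides a positional axis into evenly spaced bands with adjustable padding.

Note that `nominal` is a data type (`:N`), not a scale type; nominal fields are drawn with `ordinal`, `point`, or `band` scales.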
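To make the difference between the continuous scale types concrete, here is a minimal sketch of how a `linear`, `sqrt`, or `log` scale could map a data value onto a pixel range. The helper `make_scale`, the domain `(0.001, 100)`, and the 300-pixel range are illustrative assumptions, not Altair's implementation and not values read from the dataset.

```python
import math

def make_scale(kind, domain, range_):
    """Build a function mapping a data value to a pixel position.

    A simplified stand-in for a continuous scale, not Altair's implementation.
    """
    transform = {
        "linear": lambda v: v,
        "sqrt": math.sqrt,
        "log": math.log10,
    }[kind]
    d0, d1 = transform(domain[0]), transform(domain[1])
    r0, r1 = range_

    def scale(value):
        # Normalize the transformed value to [0, 1], then stretch it over the range.
        t = (transform(value) - d0) / (d1 - d0)
        return r0 + t * (r1 - r0)

    return scale

# Illustrative domain and pixel range (assumed, not read from the dataset).
domain, pixels = (0.001, 100), (0, 300)

# Where does the value 0.1 land on each kind of scale?
positions = {
    kind: make_scale(kind, domain, pixels)(0.1)
    for kind in ("linear", "sqrt", "log")
}
for kind, pos in positions.items():
    print(f"{kind:>6}: {pos:6.1f} px")
```

With these assumed numbers, the linear scale places 0.1 less than one pixel from the left edge, while the log scale places it 40% of the way along the axis. That is why a log scale separates small values that a linear scale collapses together.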

In [20]:
# Log scale: suited to values that span several orders of magnitude
alt.Chart(antibiotics).mark_circle().encode(
    alt.X('Neomycin:Q',
          scale=alt.Scale(type='log'))
)

In [24]:
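# sort='descending' reverses the axis, so lower MIC values (a more effective antibiotic) plot to the right
alt.Chart(antibiotics).mark_circle().encode(
    alt.X('Neomycin:Q',
          sort='descending',
          scale=alt.Scale(type='log'),
          title='Neomycin MIC (μg/ml, reverse log scale)')
)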

In [26]:
# axis=alt.Axis(orient='top') moves the x-axis to the top of the chart
alt.Chart(antibiotics).mark_circle().encode(
    alt.X('Neomycin:Q',
          sort='descending',
          scale=alt.Scale(type='log'),
          axis=alt.Axis(orient='top'),
          title='Neomycin MIC (μg/ml, reverse log scale)')
)

In [None]:
# Compare two antibiotics: both axes use reversed log scales
alt.Chart(antibiotics).mark_circle().encode(
    alt.X('Neomycin:Q',
          sort='descending',
          scale=alt.Scale(type='log'),
          title='Neomycin MIC (μg/ml, reverse log scale)'),
    alt.Y('Streptomycin:Q',
          sort='descending',
          scale=alt.Scale(type='log'),
          title='Streptomycin MIC (μg/ml, reverse log scale)')
)