<a href="https://colab.research.google.com/github/christianhelle/autofaker/blob/main/docs/cheatsheet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Install AutoFaker from PyPI

In [1]:
pip install autofaker

Collecting autofaker
  Downloading autofaker-0.1.1-py3-none-any.whl (10 kB)
Collecting pyspark
  Downloading pyspark-3.1.2.tar.gz (212.4 MB)
Collecting faker
  Downloading Faker-9.3.1-py3-none-any.whl (1.2 MB)
Collecting py4j==0.10.9
  Downloading py4j-0.10.9-py2.py3-none-any.whl (198 kB)
Building wheels for collected packages: pyspark
  Building wheel for pyspark (setup.py) ... done
  Created wheel for pyspark: filename=pyspark-3.1.2-py2.py3-none-any.whl size=212880768 sha256=d943220856e55a60c2dfd8d7a17b2b8b526604f85509443b80378b0fa373d6bd
  Stored in directory: /root/.cache/pip/wheels/a5/0a/c1/9561f6fecb759579a7d863dcd846daaa95f598744e71b02c77
Successfully built pyspark
Installing collected packages: py4j, pyspark, faker, autofaker
Successfully installed autofaker-0.1.1 faker-9.3.

Import modules

In [2]:
from autofaker import Autodata

Create anonymous variables of built-in types and dates

In [3]:
print(f'anonymous string:    {Autodata.create(str)}')
print(f'anonymous int:       {Autodata.create(int)}')
print(f'anonymous float:     {Autodata.create(float)}')

import datetime
print(f'anonymous datetime:  {Autodata.create(datetime)}')
print(f'anonymous date:      {Autodata.create(datetime.date)}')

anonymous string:    c9f61e41-46e4-4bfe-ac22-fe58e6a9aa5f
anonymous int:       9801
anonymous float:     6913.594700054302
anonymous datetime:  2025-08-12 02:16:25.000021
anonymous date:      2014-02-12 00:00:00
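
To make the shape of these values concrete, here is a minimal standard-library sketch that produces similar-looking output: UUID-shaped strings and small positive numbers. The helper name `anonymous_value` and the 0 to 10000 range are assumptions made for illustration; this is not autofaker's implementation.

```python
import datetime
import random
import uuid


def anonymous_value(t):
    """Return a random value for a few built-in types (hypothetical helper)."""
    if t is str:
        return str(uuid.uuid4())            # UUID-shaped string, like the output above
    if t is int:
        return random.randint(0, 10000)     # assumed range, chosen for illustration
    if t is float:
        return random.uniform(0, 10000)     # assumed range, chosen for illustration
    if t is datetime.datetime:
        # shift "now" by a random offset of roughly ten years in either direction
        offset = datetime.timedelta(seconds=random.randint(-315_000_000, 315_000_000))
        return datetime.datetime.now() + offset
    raise TypeError(f'unsupported type: {t!r}')


print(f'anonymous string:    {anonymous_value(str)}')
print(f'anonymous int:       {anonymous_value(int)}')
print(f'anonymous float:     {anonymous_value(float)}')
print(f'anonymous datetime:  {anonymous_value(datetime.datetime)}')
```

The values differ on every run, which is the point: a test that consumes them cannot accidentally depend on a specific literal.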


Create collections of anonymous variables of built-in types and dates (three items by default; pass a count as the second argument)

In [4]:
print(f'anonymous strings:    {Autodata.create_many(str)}')
print(f'anonymous ints:       {Autodata.create_many(int, 10)}')
print(f'anonymous floats:     {Autodata.create_many(float, 5)}')

import datetime
print(f'anonymous datetime:   {Autodata.create_many(datetime)}')
print(f'anonymous date:       {Autodata.create_many(datetime.date)}')

anonymous strings:    ['2c1b6847-85e9-444f-97d2-b2330092e43d', '0717051e-aa9a-48fb-8840-5320e8dc3f27', 'b3828833-03f6-402f-83a4-aa12f8b0f8c7']
anonymous ints:       [8534, 3943, 393, 617, 7271, 2301, 880, 9059, 3727, 6034]
anonymous floats:     [9039.43798962022, 3270.411738870235, 9319.891162871703, 3843.2526976276995, 6174.879493069972]
anonymous datetime:   [datetime.datetime(2025, 5, 10, 14, 21, 26, 143), datetime.datetime(2017, 5, 10, 19, 28, 53, 323), datetime.datetime(2022, 9, 12, 22, 37, 39, 555)]
anonymous date:       [datetime.datetime(2015, 12, 12, 0, 0), datetime.datetime(2028, 6, 16, 0, 0), datetime.datetime(2015, 2, 1, 0, 0)]


Create an anonymous class

In [5]:
class SimpleClass:
    id = 0
    text = 'test'

cls = Autodata.create(SimpleClass)
print(f'id = {cls.id}')
print(f'text = {cls.text}')

id = 7452
text = a610d4dc-5a2d-4c4d-8843-0fdbd91ef6ac


Create an anonymous dataclass

In [6]:
from dataclasses import dataclass

@dataclass
class DataClass:
    id: int
    text: str

cls = Autodata.create(DataClass)
print(f'id = {cls.id}')
print(f'text = {cls.text}')

id = 2133
text = 8c51f2de-da29-44ab-8a57-953480df6dd0
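
One plausible way to populate a dataclass like this is to read each field's annotated type and call a matching generator. The sketch below uses only the standard library; `GENERATORS` and `create_from_hints` are hypothetical names, and nothing here is confirmed to be how autofaker does it.

```python
import random
import uuid
from dataclasses import dataclass, fields

# Hypothetical generators keyed by annotated type; not autofaker's own table.
GENERATORS = {
    int: lambda: random.randint(0, 10000),
    float: lambda: random.uniform(0, 10000),
    str: lambda: str(uuid.uuid4()),
}


def create_from_hints(cls):
    """Instantiate a dataclass with a random value for every annotated field."""
    kwargs = {f.name: GENERATORS[f.type]() for f in fields(cls)}
    return cls(**kwargs)


@dataclass
class DataClass:
    id: int
    text: str


obj = create_from_hints(DataClass)
print(f'id = {obj.id}')
print(f'text = {obj.text}')
```

This sketch raises `KeyError` for any field type missing from `GENERATORS`, and it assumes annotations are real types rather than strings (that is, no `from __future__ import annotations`).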


Create an anonymous dataclass using fake data

In [7]:
@dataclass
class DataClass:
    id: int

    name: str
    address: str
    job: str

    country: str
    currency_name: str
    currency_code: str

    email: str
    safe_email: str
    company_email: str

    hostname: str
    ipv4: str
    ipv6: str

    text: str


data = Autodata.create(DataClass, use_fake_data=True)

print(f'id:               {data.id}')
print(f'name:             {data.name}')
print(f'job:              {data.job}\n')
print(f'address:\n{data.address}\n')

print(f'country:          {data.country}')
print(f'currency name:    {data.currency_name}')
print(f'currency code:    {data.currency_code}\n')

print(f'email:            {data.email}')
print(f'safe email:       {data.safe_email}')
print(f'work email:       {data.company_email}\n')

print(f'hostname:         {data.hostname}')
print(f'IPv4:             {data.ipv4}')
print(f'IPv6:             {data.ipv6}\n')

print(f'text:\n{data.text}')

id:               9625
name:             Keith Richards
job:              Geoscientist

address:
Unit 9045 Box 8283
DPO AA 44367

country:          Solomon Islands
currency name:    Swedish krona
currency code:    DOP

email:            feliciajackson@example.org
safe email:       morenosandra@example.org
work email:       billy49@mitchell.com

hostname:         lt-55.fleming-dominguez.biz
IPv4:             162.55.78.206
IPv6:             312f:3ac8:f656:651:11a4:284b:7d8f:7bf1

text:
Bank add up story. Service north animal nature.
Most because growth day. Soon me station set pattern move. Identify course economy it participant reduce result.
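
Every string field in the dataclass above (`name`, `job`, `currency_code`, `ipv4`, and so on) shares its name with a Faker provider method. A name-based lookup is one way such a mapping could work. The sketch below illustrates the idea with a stand-in provider so that it runs without Faker installed; `StubProvider`, `create_with_fake_data`, and the fallback behavior are all assumptions, not autofaker's documented internals.

```python
import random
import uuid
from dataclasses import dataclass, fields


class StubProvider:
    """Hypothetical stand-in for a fake-data source such as a Faker instance."""

    def first_name(self):
        return 'Emily'

    def job(self):
        return 'Geoscientist'


def create_with_fake_data(cls, provider):
    """Fill str fields from a provider method of the same name when one exists."""
    kwargs = {}
    for f in fields(cls):
        method = getattr(provider, f.name, None)
        if f.type is str and callable(method):
            kwargs[f.name] = method()                   # field name matched a provider method
        elif f.type is str:
            kwargs[f.name] = str(uuid.uuid4())          # no match: fall back to an anonymous string
        else:
            kwargs[f.name] = random.randint(0, 10000)   # this sketch treats every other field as int
    return cls(**kwargs)


@dataclass
class Person:
    id: int
    first_name: str
    job: str
    nickname: str


person = create_with_fake_data(Person, StubProvider())
print(person)
```

Here `first_name` and `job` come from the provider, while `nickname` has no matching method and falls back to a UUID string. If Faker is installed, a `Faker()` instance should work in place of `StubProvider()`, since it exposes methods named `first_name` and `job`.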


Create a collection of an anonymous class

In [8]:
class SimpleClass:
    id = 0
    text = 'test'

classes = Autodata.create_many(SimpleClass)
for cls in classes:
    print(f'id = {cls.id}')
    print(f'text = {cls.text}')
    print()

id = 3675
text = 9ad2c33e-6e71-489d-a703-9376d8c68867

id = 4279
text = 8044a0c4-bedb-4e96-b441-2f78e84b64ef

id = 6627
text = 67aab9c1-b8fc-49f0-b124-8d3bdee14e5e



Create an anonymous class with nested types

In [9]:
class NestedClass:
    id = 0
    text = 'test'
    inner = SimpleClass()

cls = Autodata.create(NestedClass)
print(f'id = {cls.id}')
print(f'text = {cls.text}')
print(f'inner.id = {cls.inner.id}')
print(f'inner.text = {cls.inner.text}')

id = 5506
text = 36606555-95b6-4145-bdd8-c6872198d75a
inner.id = 9457
inner.text = e7f2761a-052c-40d6-a5c8-580a6bab418e
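
Plain classes such as `SimpleClass` and `NestedClass` carry no type annotations, so the only type information available is the default value of each class attribute. The sketch below shows one way the type could be inferred from those defaults, recursing into nested objects. `create_from_defaults` is a hypothetical helper written for illustration and is not taken from autofaker.

```python
import random
import uuid


def create_from_defaults(cls):
    """Build an instance whose values are inferred from class-attribute defaults."""
    obj = cls()
    for name, default in vars(cls).items():
        if name.startswith('__') or callable(default):
            continue                                   # skip dunder entries and methods
        if isinstance(default, bool):
            value = random.choice([True, False])       # check bool first: bool is a subclass of int
        elif isinstance(default, int):
            value = random.randint(0, 10000)
        elif isinstance(default, float):
            value = random.uniform(0, 10000)
        elif isinstance(default, str):
            value = str(uuid.uuid4())
        elif hasattr(default, '__dict__'):
            value = create_from_defaults(type(default))  # nested object: recurse into its class
        else:
            value = default                            # unknown type: keep the default as-is
        setattr(obj, name, value)                      # set on the instance, not the class
    return obj


class SimpleClass:
    id = 0
    text = 'test'


class NestedClass:
    id = 0
    text = 'test'
    inner = SimpleClass()


nested = create_from_defaults(NestedClass)
print(f'id = {nested.id}')
print(f'inner.text = {nested.inner.text}')
```

Because values are assigned on the instance, the class-level defaults (`NestedClass.id == 0`) are left untouched.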


Create a collection of an anonymous class with nested types

In [10]:
class NestedClass:
    id = 0
    text = 'test'
    inner = SimpleClass()

classes = Autodata.create_many(NestedClass)
for cls in classes:
    print(f'id = {cls.id}')
    print(f'text = {cls.text}')
    print(f'inner.id = {cls.inner.id}')
    print(f'inner.text = {cls.inner.text}')

id = 5123
text = 20018a82-62db-4318-ac65-e98e8df47f80
inner.id = 4855
inner.text = 21f657bc-b850-4a41-829c-ae31c9f490b4
id = 2991
text = 3e54c1ad-ba97-4989-a8c0-3270f9ef195d
inner.id = 3243
inner.text = 4c63aef5-0867-4a61-95ac-5a6200cf9c8b
id = 3218
text = a2d00017-9796-4447-8295-2ed0b8a32e65
inner.id = 2240
inner.text = f4cafbe3-8f51-4a5b-ac17-3cb616de8160


Create a Pandas DataFrame using anonymous data generated from a specified type

In [11]:
class DataClass:
    id = 0
    type = ''
    value = 0

pdf = Autodata.create_pandas_dataframe(DataClass)
print(pdf)

     id                                  type  value
0   881  9a5a8f4c-4daf-4334-aef4-d1a9826e8cba   3295
1  8641  39fe5652-3315-4bff-a76e-1fa18772bab3   6725
2  8241  7a51ce7c-d831-494f-a4d1-d193c29d2f74   5868


Create a Pandas DataFrame using fake data generated from a specified type

In [12]:
@dataclass
class DataClass:
    id: int
    first_name: str
    last_name: str
    phone_number: str

pdf = Autodata.create_pandas_dataframe(DataClass, use_fake_data=True)
print(pdf)

  first_name    id last_name         phone_number
0     Joseph  7885   Parrish   153.993.0504x99360
1    Zachary   948      Sims     001-955-065-0542
2       Ryan  5349  Browning  +1-102-386-2956x440


Create a Spark DataFrame using anonymous data generated from a specified type

In [13]:
@dataclass
class DataClass:
    id: int
    first_name: str
    last_name: str
    phone_number: str

df = Autodata.create_spark_dataframe(DataClass)
df.printSchema()
df.show()

root
 |-- first_name: string (nullable = true)
 |-- id: long (nullable = true)
 |-- last_name: string (nullable = true)
 |-- phone_number: string (nullable = true)

+--------------------+----+--------------------+--------------------+
|          first_name|  id|           last_name|        phone_number|
+--------------------+----+--------------------+--------------------+
|4b776e0b-4842-4c5...|8740|d52920d3-26ce-406...|db3a4fdb-c3d1-476...|
|4c53be35-91ed-435...|2641|c9644629-068b-490...|77345c0a-49ce-48d...|
|18579ebe-92bd-452...|1928|41d46a78-c10e-40b...|8e5eb120-17f7-415...|
+--------------------+----+--------------------+--------------------+



Create a Spark DataFrame using fake data generated from a specified type

In [14]:
@dataclass
class DataClass:
    id: int
    first_name: str
    last_name: str
    job: str

df = Autodata.create_spark_dataframe(DataClass, use_fake_data=True)
df.printSchema()
df.show()

root
 |-- first_name: string (nullable = true)
 |-- id: long (nullable = true)
 |-- job: string (nullable = true)
 |-- last_name: string (nullable = true)

+----------+---+--------------------+---------+
|first_name| id|                 job|last_name|
+----------+---+--------------------+---------+
|     Emily|793|Sports administrator| Hamilton|
|  Nicholas|495| Corporate treasurer| Davidson|
|     Diane|476| Marketing executive|  Aguilar|
+----------+---+--------------------+---------+

