PacktPublishing/Data-Exploration-and-Preparation-with-BigQuery

Data Exploration and Preparation with BigQuery

This is the code repository for Data Exploration and Preparation with BigQuery, published by Packt.

A practical guide to cleaning, transforming, and analyzing data for business insights

What is this book about?

Data professionals encounter a multitude of challenges such as handling large volumes of data, dealing with data silos, and the lack of appropriate tools. Datasets often arrive in different conditions and formats, demanding considerable time from analysts, engineers, and scientists to process and uncover insights. The complexity of the data life cycle often hinders teams and organizations from extracting the desired value from their data assets. Data Exploration and Preparation with BigQuery offers a holistic solution to these challenges.

This book covers the following exciting features:
  • Assess the quality of a dataset and learn best practices for data cleansing
  • Prepare data for analysis, visualization, and machine learning
  • Explore approaches to data visualization in BigQuery
  • Apply acquired knowledge to real-life scenarios and design patterns
  • Set up and organize BigQuery resources
  • Use SQL and other tools to navigate datasets
  • Implement best practices to query BigQuery datasets
  • Gain proficiency in using data preparation tools, techniques, and strategies

If you feel this book is for you, get your copy today!

https://www.packtpub.com/

Instructions and Navigation

All of the code is organized into folders. For example, Chapter04.

The code will look like the following:

CREATE TABLE `ch11.jewelry_sales_data2` (
  date DATE, order_id INT, product_id INT, quantity INT,
  category_id INT, category_name STRING, brand_id INT,
  price FLOAT64, gender STRING, metal STRING, stone STRING
)
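
As a rough illustration of the kind of data-quality assessment the book describes, the query below profiles the table defined above. It is a minimal sketch, not code taken from the book: it assumes the `ch11.jewelry_sales_data2` table has already been created and loaded, and the specific checks (missing prices, non-positive quantities) are hypothetical examples chosen for illustration.

```sql
-- Illustrative sketch only: assumes `ch11.jewelry_sales_data2` exists and contains rows.
-- The checks below are example choices, not the book's prescribed method.
SELECT
  COUNT(*) AS total_rows,                                      -- overall row count
  COUNTIF(price IS NULL) AS missing_price,                     -- rows with no price recorded
  COUNTIF(quantity IS NULL OR quantity <= 0) AS suspect_quantity,  -- missing or non-positive quantities
  COUNT(DISTINCT order_id) AS distinct_orders                  -- number of unique orders
FROM `ch11.jewelry_sales_data2`;
```

Comparing `total_rows` with the other counts gives a quick sense of how much of the table might need cleansing before analysis.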

Following is what you need for this book: This book is for data analysts seeking to enhance their data exploration and preparation skills using BigQuery. It guides anyone using BigQuery as a data warehouse to extract business insights from large datasets. A basic understanding of SQL, reporting, data modeling, and transformations will assist with understanding the topics covered in this book.

With the following software and hardware list, you can run all of the code files present in the book (Chapters 3-13).

Software and Hardware List

| Chapter | Software required | OS required |
| --- | --- | --- |
| 3-13 | Google Cloud | Windows, macOS, or ChromeOS |
| 3-13 | GoogleSQL, SQL | Windows, macOS, or ChromeOS |
| 3-13 | Dataprep by Trifacta | Windows, macOS, or ChromeOS |

Get to Know the Author

Mike Kahn is a data and infrastructure enthusiast who currently leads a Customer Engineering team at Google Cloud. Prior to Google, Mike spent five years in solution architecture roles and worked in operations and leadership roles in the data center industry. His more than 15 years of experience have given him deep knowledge of data and infrastructure engineering, operations, strategy, and leadership. Mike holds multiple Google Cloud certifications and is a lifelong learner. He is based in Boca Raton, Florida, in the US, and holds a Bachelor of Science degree and a Master of Science degree in Management Information Systems (MIS) from Florida International University.