Code samples for my book "Computing with Data: An Introduction to the Data Industry"

Computing with Data


A Guide Book to the Data Industry

  • Introduces programmers to data science concepts and practices through tools such as Python programming and data processing techniques
  • Explains the programming languages and data processing systems commonly used to address engineering challenges
  • Explores new tools and libraries to use in big data projects
  • Presents principles that can be employed in applications ranging from software simulations to real-world web applications that serve millions of users
  • Contains numerous examples that explain interconnected concepts; most of the code samples on this website are editable and runnable in an interactive web-based IDE

You may also choose to run code samples using our Docker images, which come with dependencies, tools, and code samples pre-installed.

Computing with Data introduces basic computing skills designed for industry professionals without a strong computer science background. Written in an easily accessible manner, it serves as a self-study guide to data science and data engineering for those who aspire to start a computing career or expand their current roles in areas such as applied statistics, big data, machine learning, data mining, and informatics. The authors draw on their combined experience working at software and social network companies, working on big data products at several major online retailers, and building big data systems for an AI startup. Spanning from the basic inner workings of a computer to advanced data manipulation techniques, this book opens doors for readers to quickly explore and enhance their computing knowledge.

Computing with Data covers a wide range of computational topics essential for data scientists, analysts, and engineers, providing them with the tools they need in any role that involves computing with data. The introduction is self-contained, and chapters progress from basic hardware concepts to operating systems, programming languages, graphing and processing data, testing and programming tools, big data frameworks, and cloud computing. The book is written with several audiences in mind. Readers without a strong educational background in CS, or those who need a refresher, will find the chapters on hardware, operating systems, and programming languages particularly useful.