ChetanKnowIt/BDT_Notes

BDT NOTES

This repository holds all the big data related content I will be studying.

Table of Contents:

BDA

Day 2

  • RAID
  • Java heap and Java basics
  • installation and configuration of Hadoop
  • the first spreadsheet software
  • definition of IoT
  • self-driving cars
  • data curation
  • Linux namespaces and Linux commands

Day 3

  • Unix history
  • IBM big data metrics
  • throughput
  • concurrency
  • Notes

Day 4

  • Hadoop commands
  • MapReduce introduction
  • SDLC
  • check Notes

Day 5

  • mapper and reducer with Python
  • MapReduce Java example for wordcount
  • check Notes

Day 6

  • more exercises
  • MapReduce for Ages
  • MapReduce for Marks
  • click here

Day 7

  • MapReduce for internet usage in usage
  • MapReduce for Facebook likes on video for Feb 2018 in insight
  • click here

Day 8

Day 9

  • mintemp, maxtemp, forest, funniest post
  • all code, no notes -> here

Day 10

Day 11

  • MR streaming code for Python: check here
  • With Rscript in r_demo
  • Wordcount in R
  • Multi Node Setup

Day 12

  • Big data ecosystem
  • Hive
  • Notes -> here

Day 13

  • Hive commands and operations
  • Python with Hive
  • Notes -> here

Day 14

  • Python with Hive
  • built-in functions of Hive
  • join, subquery
  • Notes -> here

Day 15

  • NoSQL
  • HBase introduction
  • Notes -> here

Day 16

  • HBase setup
  • HBase commands
  • Notes -> here

Day 17

  • Spark introduction
  • spark-shell practical
  • Differences between RDD, DataFrame, and Dataset
  • Notes -> here

Day 18

  • PySpark in Colab
  • Discussion on Data Analytics
  • Notes -> here

Day 19