Big_Data_Programming_ICP_4_Module2
- Spark Streaming using Log File Generator.
- Spark Streaming word count over text received from a data server listening on a TCP socket.
- Spark Streaming for Character Frequency using TCP Socket.
In the first part, lorem.txt is the input, and file.py generates log files from this text file.
Run streaming.py first, then run file.py, which generates the log files and saves them in the log folder.
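The two scripts are not reproduced here, so the following is only a minimal sketch of how such a pair could look, not the actual file.py or streaming.py. The function names, the `log1.txt` style file names, the chunk size, and the choice of a word count as the streaming job are all assumptions. The sketch moves each finished file into the log folder, because Spark's `textFileStream` expects files to appear in the monitored directory atomically.

```python
import os
import tempfile
import time


def write_log_files(source_path, log_dir, lines_per_file=5, num_files=3, delay=0.0):
    """Write num_files log files into log_dir, each holding a slice of source_path.

    Hypothetical stand-in for file.py; names and defaults are illustrative.
    """
    with open(source_path, encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f if line.strip()]
    if not lines:
        raise ValueError("source file has no non-empty lines")
    os.makedirs(log_dir, exist_ok=True)
    parent = os.path.dirname(os.path.abspath(log_dir))
    written = []
    for i in range(num_files):
        start = (i * lines_per_file) % len(lines)
        # Wrap around so every file gets lines_per_file lines.
        chunk = [lines[(start + j) % len(lines)] for j in range(lines_per_file)]
        # Write next to the watched folder first, then move the finished file in,
        # so the stream never picks up a half-written file.
        fd, tmp_path = tempfile.mkstemp(suffix=".tmp", dir=parent)
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write("\n".join(chunk) + "\n")
        final_path = os.path.join(log_dir, f"log{i + 1}.txt")
        os.replace(tmp_path, final_path)
        written.append(final_path)
        time.sleep(delay)
    return written


def stream_log_word_counts(log_dir, batch_seconds=5):
    """Hypothetical stand-in for streaming.py: word counts over new files in log_dir."""
    # Imported lazily so the generator above works without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext("local[2]", "LogFileWordCount")
    ssc = StreamingContext(sc, batch_seconds)
    # Only files that appear after the stream starts are processed.
    lines = ssc.textFileStream(log_dir)
    counts = (lines.flatMap(lambda line: line.split())
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))
    counts.pprint()
    ssc.start()
    ssc.awaitTermination()
```

Because the stream only sees files created after it starts, `stream_log_word_counts("log")` would be started first and `write_log_files("lorem.txt", "log", delay=5)` run afterwards in a second terminal, which matches the order described above.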
In the second part, first start a netcat server listening on port 6000 from cmd, then run the wordcount.py program.
The words in any lines typed into the terminal running the netcat server are then counted, and the counts are printed every second.
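As a rough illustration of this step, here is a sketch of what a program like wordcount.py might contain; it is not the actual file. The function names and the `localhost` host are assumptions, while port 6000 and the one-second batch interval come from the description above. On Linux or macOS the server is typically started with `nc -lk 6000`; the exact flags depend on the netcat build.

```python
def count_words(lines):
    """Plain-Python reference for the counts a single batch should produce."""
    counts = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def run_word_count(host="localhost", port=6000, batch_seconds=1):
    """Count words arriving on a TCP socket, printing the counts once per batch."""
    # Imported lazily so count_words can be used without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    # At least two local threads: one for the socket receiver, one for processing.
    sc = SparkContext("local[2]", "NetworkWordCount")
    ssc = StreamingContext(sc, batch_seconds)
    lines = ssc.socketTextStream(host, port)
    counts = (lines.flatMap(lambda line: line.split())
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))
    counts.pprint()
    ssc.start()
    ssc.awaitTermination()
```

Calling `run_word_count()` while the netcat server is running should print per-batch counts; `count_words(["hello world", "hello spark"])` gives the same result for one batch without Spark.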
The third part is similar to the previous part of this ICP. The only difference is that the previous part counts words, while this part counts individual characters.
First start a netcat server listening on port 6000 from cmd, then run the characterfreq.py program.
The characters in any lines typed into the terminal running the netcat server are then counted, and the counts are printed every second.
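The change from words to characters can be sketched as follows. This is an illustrative guess at the core of characterfreq.py, not the file itself: both function names are hypothetical, and skipping whitespace characters is an assumption, since the description does not say whether spaces are counted.

```python
from collections import Counter


def char_frequency(lines, skip_whitespace=True):
    """Plain-Python reference: frequency of each character across the given lines."""
    counts = Counter()
    for line in lines:
        for ch in line:
            # Assumption: whitespace is ignored unless skip_whitespace is False.
            if skip_whitespace and ch.isspace():
                continue
            counts[ch] += 1
    return dict(counts)


def char_counts(lines_dstream):
    """Apply the same logic to a DStream of lines; returns a DStream of (char, count)."""
    return (lines_dstream
            .flatMap(lambda line: [ch for ch in line if not ch.isspace()])
            .map(lambda ch: (ch, 1))
            .reduceByKey(lambda a, b: a + b))
```

With a `StreamingContext` named `ssc` set up as for the word count, `char_counts(ssc.socketTextStream("localhost", 6000)).pprint()` would print the per-batch character counts.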
https://spark.apache.org/docs/2.2.0/streaming-programming-guide.html