- URL I used is "https://en.wikipedia.org/wiki/Big_data"
- Curl command used to return the page text:
curl "https://en.wikipedia.org/wiki/Big_data"
- Curl command used to get data and output to a file (lowercase "-o" takes the output file name):
curl "https://en.wikipedia.org/wiki/Big_data" -o data.txt
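As a minimal sketch of the download step, the function below wraps curl with a few commonly used flags. The name fetch_page is a hypothetical helper chosen for illustration, not part of curl. Lowercase -o writes to the file name you give it, whereas uppercase -O takes no file name and saves under the remote name.

```shell
# fetch_page is a hypothetical helper name used only for this example.
fetch_page() {
  url=$1
  out_file=$2
  # -s: hide the progress meter    -S: still print errors
  # -f: fail on HTTP error codes   -L: follow redirects
  # -o: write the response body to the named file
  curl -sSfL -o "$out_file" "$url"
}

# Usage (needs network access):
# fetch_page "https://en.wikipedia.org/wiki/Big_data" data.txt
```

The saved file is the raw HTML of the page, so any later word counts will include markup as well as article text.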
- Bash commands are used to process the text.
- To divide each line into individual words, translate every space into a newline ('\12' is the octal code for the newline character):
tr ' ' '\12' < data.txt
- Pipe is defined as sending the results of one command as input to another command.
- To sort the words:
tr ' ' '\12' < data.txt | sort
- To count the words, pipe the sorted output to uniq -c. The "-c" flag prefixes each unique line with the number of times it occurs. uniq only merges adjacent duplicate lines, which is why the input is sorted first:
tr ' ' '\12' < data.txt | sort | uniq -c
- To order the counted words from most to least frequent, pipe the uniq -c output to sort with the -nr flags:
tr ' ' '\12' < data.txt | sort | uniq -c | sort -nr
- To redirect the output to the result.txt file:
tr ' ' '\12' < data.txt | sort | uniq -c | sort -nr > result.txt
- Up arrow in bash shell is used to get the previous commands we used.
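The word-count pipeline can be tried on a tiny input before running it on the downloaded page. This is a sketch under stated assumptions: sample.txt and its contents are made up for illustration and stand in for data.txt.

```shell
# sample.txt is a made-up stand-in for data.txt.
work_dir=$(mktemp -d)
printf 'big data big\ndata big words\n' > "$work_dir/sample.txt"

# Step 1: tr puts one word on each line.
# Step 2: sort makes identical words adjacent.
# Step 3: uniq -c collapses adjacent duplicates and prefixes each with its count.
# Step 4: sort -nr orders by that count, largest first.
tr ' ' '\12' < "$work_dir/sample.txt" | sort | uniq -c | sort -nr > "$work_dir/result.txt"

# Each line of result.txt is a count followed by a word:
# "big" (3) comes first, then "data" (2), then "words" (1).
cat "$work_dir/result.txt"

# The most frequent word is the second field of the first line.
top_word=$(head -n 1 "$work_dir/result.txt" | awk '{print $2}')
echo "most frequent word: $top_word"
```

The width of the padding that uniq -c puts before each count can differ between systems, so the example reads fields with awk instead of relying on fixed columns.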
- To display the usage summary and the options available for sort:
sort --help
- "-n" flag is used to sort the data by comparing numerical values.
- "-r" flag is used to reverse the results of comparisons.
- Only one dash is used when it is a single letter flag.
- Two dashes are used when it is a flag with more than one letter.
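The effect of these flags is easiest to see on a short list of numbers. The sketch below assumes GNU sort for the long option names (--numeric-sort and --reverse); the input values are made up for illustration.

```shell
# Made-up input: three numbers, one per line.
numbers=$(printf '10\n9\n100\n')

# Default sort compares text character by character, so "100" sorts before "9".
lexical=$(printf '%s\n' "$numbers" | sort)

# -n compares by numerical value: 9, 10, 100.
numeric=$(printf '%s\n' "$numbers" | sort -n)

# Single-letter flags can be combined after one dash: -nr is -n plus -r.
descending=$(printf '%s\n' "$numbers" | sort -nr)

# The same request spelled with two-dash, multi-letter flags.
long_form=$(printf '%s\n' "$numbers" | sort --numeric-sort --reverse)

printf 'default:\n%s\n-n:\n%s\n-nr:\n%s\n' "$lexical" "$numeric" "$descending"
```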
- To redirect a listing of all the contents of our directory to a file (the file name is of our choice):
ls > filename.txt
- We can use two arrows (>>) to append instead of overwriting.
- The ls command is used to list all the contents of the current directory.
- We can use the cat command to display the contents of a file.
Example:
cat temp.txt
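The difference between > and >> can be checked with a throwaway directory. This is a small sketch in which a.txt, b.txt and the temporary paths are made-up names. The listing is written outside the listed directory because the shell creates the output file before ls runs, so a file redirected into the same directory would appear in its own listing.

```shell
# Made-up directory with two empty files to list.
demo_dir=$(mktemp -d)
out_file=$(mktemp)
touch "$demo_dir/a.txt" "$demo_dir/b.txt"

# ">" overwrites: after either of these commands the file holds two lines.
ls "$demo_dir" > "$out_file"
ls "$demo_dir" > "$out_file"

# ">>" appends: the same two names are added again, giving four lines.
ls "$demo_dir" >> "$out_file"

# cat displays the file; wc -l counts its lines.
cat "$out_file"
line_count=$(wc -l < "$out_file")
echo "lines in listing: $line_count"
```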