# Chapter 4: Developing Your First Ruby Application

<div id="toc"></div>

## 4.1 Working with Source Code Files

### 4.1.1 Creating a Test File

* __Visual Studio Code__

https://www.jetbrains.com/ruby/

* __Alternatives to Linux__

### 4.1.2 A Simple Source Code File

In [1]:
x = 2
print "This program is running okay if 2 + 2 = #{x + x}"

This program is running okay if 2 + 2 = 4

### 4.1.3 Running Your Source Code

* __Windows__

* __Mac OS X / macOS__

* __Linux and Other UNIX-Based Systems__

## 4.2 Our Application: A Text Analyzer

### 4.2.1 Required Basic Features

### 4.2.2 Building the Basic Application

### 4.2.3 Obtaining Some Dummy Text

http://www.rubyinside.com/book/

http://www.rubyinside.com/book/oliver.txt 

### 4.2.4 Loading Text Files and Counting Lines

In [None]:
File.open("text.txt").each { |line| puts line }

In [None]:
line_count = 0
File.open("text.txt").each { |line| line_count += 1 }
puts line_count

In [None]:
text = ''
line_count = 0
File.open("text.txt").each do |line|
  line_count += 1
  text << line
end
puts "#{line_count} lines"

In [None]:
lines = File.readlines("text.txt")
line_count = lines.size
text = lines.join
puts "#{line_count} lines"
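The `File.open(...).each` calls above never close the file explicitly. As a minimal sketch of one alternative, `File.foreach` yields each line and closes the file once the block finishes. The temporary file is a hypothetical stand-in for `text.txt`, created only so the example runs on its own.

```ruby
require 'tempfile'

# Hypothetical sample file standing in for text.txt.
sample = Tempfile.new('sample')
sample.write("line one\nline two\nline three\n")
sample.close

line_count = 0
text = +''   # unary plus gives a mutable string even if literals are frozen

# File.foreach opens the file, yields each line, and closes it afterwards.
File.foreach(sample.path) do |line|
  line_count += 1
  text << line
end

puts "#{line_count} lines"
sample.unlink   # remove the temporary file
```

With the three-line sample this prints `3 lines`, and `text` holds the full contents of the file.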

### 4.2.5 Counting Characters

In [None]:
total_characters = text.length
puts "#{total_characters} characters"

In [None]:
"this is a test".gsub(/t/, 'X')

In [None]:
total_characters_nospaces = text.gsub(/\s+/, '').length
puts "#{total_characters_nospaces} characters excluding spaces"
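The pattern `/\s+/` matches any run of whitespace, not only the space character. The following sketch, using a made-up sample string, compares it with removing plain spaces alone.

```ruby
sample = "a b\tc\n"   # contains one space, one tab, and one newline

puts sample.length                   # 6: every character, whitespace included
puts sample.gsub(/\s+/, '').length   # 3: spaces, tabs, and newlines removed
puts sample.delete(' ').length       # 5: only the plain space removed
```

For a multi-line text file the difference matters, because every line break counts as a character unless it is removed.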

### 4.2.6 Counting Words

In [2]:
puts "this is a test".scan(/\w/).join

thisisatest


In [3]:
puts "this is a test".scan(/\w+/).join('-')

this-is-a-test


In [4]:
puts "this is a test".scan(/\w+/).length

4


In [5]:
puts "this is a test".split.length

4


In [6]:
text = "First-class decisions require clear-headed thinking."
puts "Scan method: #{text.scan(/\w+/).length}"
puts "Split method: #{text.split.length}"

Scan method: 7
Split method: 5


In [7]:
word_count = text.split.length
puts "#{word_count} words"

5 words
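The two counts differ because `\w` does not match a hyphen. Printing the arrays makes this visible. The last pattern is an assumed alternative, not part of the analyzer, that allows hyphens and apostrophes inside a word.

```ruby
sample = "First-class decisions require clear-headed thinking."

# \w+ stops at the hyphen, so each hyphenated word is counted twice.
p sample.scan(/\w+/)

# split breaks on whitespace only, so punctuation stays attached to words.
p sample.split

# Assumed alternative: allow hyphens and apostrophes inside a word.
words = sample.scan(/[\w'-]+/)
p words
puts "#{words.length} words"
```

The alternative pattern agrees with `split` on the count (5) while dropping the trailing period from the last word.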


### 4.2.7 Counting Sentences and Paragraphs

In [8]:
sentence_count = text.split(/\.|\?|!/).length

1

In [9]:
puts "Test code! It works. Does it? Yes.".split(/\.|\?|!/).length

4
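Splitting on punctuation leaves whatever follows the last delimiter as a fragment of its own, so a text that ends in a newline is counted as having one sentence too many. A small sketch, using a sample string with a trailing newline, shows the effect and one possible cleanup.

```ruby
sample = "Test code! It works. Does it? Yes.\n"

# The character class [.?!] matches the same delimiters as /\.|\?|!/.
raw = sample.split(/[.?!]/)
p raw   # the trailing newline appears as a fifth fragment

# Trim each fragment and drop the ones that are empty after trimming.
sentences = raw.map(&:strip).reject(&:empty?)
p sentences
puts "#{sentences.length} sentences"
```

After the cleanup the count is 4, matching the number of sentences actually in the string.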


In [10]:
text = %q{
This is a test of
paragraph one.
This is a test of
paragraph two.
This is a test of
paragraph three.
}
puts text.split(/\n\n/).length

1


In [11]:
paragraph_count = text.split(/\n\n/).length
puts "#{paragraph_count} paragraphs"

1 paragraphs
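The count is 1 because the sample text above has no blank lines between its paragraphs, and `split(/\n\n/)` only breaks on two consecutive newlines. As a sketch, the same text with blank lines inserted (an assumption about how the paragraphs were meant to be separated) gives 3.

```ruby
spaced_text = <<~TEXT
  This is a test of
  paragraph one.

  This is a test of
  paragraph two.

  This is a test of
  paragraph three.
TEXT

# \n{2,} also tolerates more than one blank line between paragraphs.
paragraphs = spaced_text.split(/\n{2,}/)
puts "#{paragraphs.length} paragraphs"
```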


In [12]:
sentence_count = text.split(/\.|\?|!/).length
puts "#{sentence_count} sentences"

4 sentences


### 4.2.8 Calculating Averages

In [13]:
puts "#{sentence_count / paragraph_count} sentences per paragraph (average)"
puts "#{word_count / sentence_count} words per sentence (average)"

4 sentences per paragraph (average)
1 words per sentence (average)
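Both averages above use integer division, which discards the remainder. If fractional averages are wanted, one option is `Integer#fdiv`, sketched here with made-up counts.

```ruby
sentence_count = 4
paragraph_count = 3

# Integer division discards the remainder.
puts sentence_count / paragraph_count

# fdiv performs floating-point division; round keeps one decimal place.
average = sentence_count.fdiv(paragraph_count).round(1)
puts "#{average} sentences per paragraph (average)"
```

The first line prints `1` and the second prints `1.3 sentences per paragraph (average)`.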


### 4.2.9 The Source Code So Far

In [None]:
lines = File.readlines("text.txt")
line_count = lines.size
text = lines.join
word_count = text.split.length
character_count = text.length
character_count_nospaces = text.gsub(/\s+/, '').length
paragraph_count = text.split(/\n\n/).length
sentence_count = text.split(/\.|\?|!/).length
puts "#{line_count} lines"
puts "#{character_count} characters"
puts "#{character_count_nospaces} characters excluding spaces"
puts "#{word_count} words"
puts "#{paragraph_count} paragraphs"
puts "#{sentence_count} sentences"
puts "#{sentence_count / paragraph_count} sentences per paragraph (average)"
puts "#{word_count / sentence_count} words per sentence (average)"

## 4.3 Adding Extra Features

### 4.3.1 Percentage of “Useful” Words

* __Note__ For more information about stop words, including links to complete lists, visit http://en.wikipedia.org/wiki/Stop_words.

In [15]:
stopwords = %w{the a by on for of are with just but and to the my I has some in}

["the", "a", "by", "on", "for", "of", "are", "with", "just", "but", "and", "to", "the", "my", "I", "has", "some", "in"]

In [16]:
text = %q{Los Angeles has some of the nicest weather in the country.}
stopwords = %w{the a by on for of are with just but and to the my in I has some}

["the", "a", "by", "on", "for", "of", "are", "with", "just", "but", "and", "to", "the", "my", "in", "I", "has", "some"]

In [17]:
words = text.scan(/\w+/)
keywords = words.select { |word| !stopwords.include?(word) }

["Los", "Angeles", "nicest", "weather", "country"]

In [18]:
puts keywords.join(' ')

Los Angeles nicest weather country


In [19]:
keywords = words.select { |word| !stopwords.include?(word) }

["Los", "Angeles", "nicest", "weather", "country"]

In [20]:
((keywords.length.to_f / words.length.to_f) * 100).to_i

45
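`include?` compares strings exactly, so a capitalized "The" at the start of a sentence is not treated as a stop word. The sketch below shows a case-insensitive variant; the sample sentence and the use of `Set` are assumptions added for illustration.

```ruby
require 'set'

stopwords = %w{the a by on for of are with just but and to my I has some in}
# Lowercase the list once and keep it in a Set for fast lookups.
stopword_set = stopwords.map(&:downcase).to_set

sample = "The weather in the country is nice"
words = sample.scan(/\w+/)

case_sensitive = words.reject { |word| stopwords.include?(word) }
case_insensitive = words.reject { |word| stopword_set.include?(word.downcase) }

p case_sensitive     # "The" survives because only "the" is in the list
p case_insensitive   # "The" is now filtered out as well
```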

In [None]:
stopwords = %w{the a by on for of are with just but and to the my I has some in}
lines = File.readlines("text.txt")
line_count = lines.size
text = lines.join

In [None]:
# Count the words, characters, paragraphs and sentences
word_count = text.split.length
character_count = text.length
character_count_nospaces = text.gsub(/\s+/, '').length
paragraph_count = text.split(/\n\n/).length
sentence_count = text.split(/\.|\?|!/).length
# Make a list of words in the text that aren't stop words,
# count them, and work out the percentage of non-stop words
# against all words
all_words = text.scan(/\w+/)
good_words = all_words.reject { |word| stopwords.include?(word) }
good_percentage = ((good_words.length.to_f / all_words.length.to_f) * 100).to_i
# Give the analysis back to the user
puts "#{line_count} lines"
puts "#{character_count} characters"
puts "#{character_count_nospaces} characters (excluding spaces)"
puts "#{word_count} words"
puts "#{sentence_count} sentences"
puts "#{paragraph_count} paragraphs"
puts "#{sentence_count / paragraph_count} sentences per paragraph (average)"
puts "#{word_count / sentence_count} words per sentence (average)"
puts "#{good_percentage}% of words are non-fluff words"

### 4.3.2 Summarizing by Finding “Interesting” Sentences

In [None]:
text = %q{
Ruby is a great programming language. It is object oriented
and has many groovy features. Some people don't like it, but that's
not our problem! It's easy to learn. It's great. To learn more about Ruby,
visit the official Ruby web site today.
}
sentences = text.gsub(/\s+/, ' ').strip.split(/\.|\?|!/)
sentences_sorted = sentences.sort_by { |sentence| sentence.length }
one_third = sentences_sorted.length / 3
ideal_sentences = sentences_sorted.slice(one_third, one_third + 1)
ideal_sentences = ideal_sentences.select { |sentence| sentence =~ /is|are/ }
puts ideal_sentences.join(". ")

In [22]:
sentences = text.gsub(/\s+/, ' ').strip.split(/\.|\?|!/)

["Los Angeles has some of the nicest weather in the country"]

In [23]:
sentences_sorted = sentences.sort_by { |sentence| sentence.length }

["Los Angeles has some of the nicest weather in the country"]

In [24]:
one_third = sentences_sorted.length / 3
ideal_sentences = sentences_sorted.slice(one_third, one_third + 1)

["Los Angeles has some of the nicest weather in the country"]

In [25]:
ideal_sentences = ideal_sentences.select { |sentence| sentence =~ /is|are/ }

[]

In [26]:
puts ideal_sentences.join(". ")
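The steps above can be collected into a single method. This is a sketch under two assumptions that differ from the cells above: each sentence is stripped of surrounding whitespace, and the pattern uses `\b` word boundaries so that "is" does not match inside words such as "this" or "visit". The method name `summarize` is hypothetical.

```ruby
# Hypothetical helper that wraps the summarizing steps in one method.
def summarize(text)
  sentences = text.gsub(/\s+/, ' ').strip.split(/[.?!]/).map(&:strip)
  sorted = sentences.sort_by(&:length)
  one_third = sorted.length / 3
  # Skip the shortest third, then take one_third + 1 sentences from the middle.
  middle = sorted.slice(one_third, one_third + 1)
  # Keep only sentences containing "is" or "are" as whole words.
  middle.select { |sentence| sentence =~ /\b(is|are)\b/ }
end

sample = <<~TEXT
  Ruby is a great programming language. It is object oriented
  and has many groovy features. Some people don't like it, but that's
  not our problem! It's easy to learn. It's great. To learn more about Ruby,
  visit the official Ruby web site today.
TEXT

summary = summarize(sample)
puts summary.join(". ")
```

For this sample the summary keeps the first two sentences of the paragraph: the two shortest and the longest are dropped by the slice, and the remaining middle sentence contains neither "is" nor "are".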




### 4.3.3 Analyzing Files Other Than text.txt

In [27]:
puts ARGV.join('-')

kernel-C:\Users\AaronHsu\AppData\Roaming\jupyter\runtime\kernel-e2dcf565-80c4-45fb-be91-3877ff1f9de0.json


In [None]:
lines = File.readlines(ARGV[0])
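Inside a notebook, `ARGV` holds the kernel's own arguments, as the output above shows, and when a script is run with no arguments `ARGV[0]` is `nil`. One way to cope, sketched with a hypothetical helper named `input_path`, is to fall back to a default file name.

```ruby
# Hypothetical helper: choose the file to analyze from a list of arguments.
def input_path(args, default = "text.txt")
  args.first || default
end

puts input_path([])               # no argument given, so the default is used
puts input_path(["oliver.txt"])   # the first argument wins when present

# In a script run from the command line you would pass ARGV itself:
#   lines = File.readlines(input_path(ARGV))
```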

## 4.4 The Completed Program

* __Note__ Remember that the source code for this book is available in the Source Code area at http://www.apress.com, so it isn’t strictly necessary to type in code directly from the book.

In [None]:
# analyzer.rb -- Text Analyzer
stopwords = %w{the a by on for of are with just but and to the my I has some in}
lines = File.readlines(ARGV[0])
line_count = lines.size
text = lines.join
# Count the words, characters, paragraphs and sentences
word_count = text.split.length
character_count = text.length
character_count_nospaces = text.gsub(/\s+/, '').length
paragraph_count = text.split(/\n\n/).length
sentence_count = text.split(/\.|\?|!/).length
# Make a list of words in the text that aren't stop words,
# count them, and work out the percentage of non-stop words
# against all words
all_words = text.scan(/\w+/)
good_words = all_words.reject { |word| stopwords.include?(word) }
good_percentage = ((good_words.length.to_f / all_words.length.to_f) * 100).to_i
# Summarize the text by cherry picking some choice sentences
sentences = text.gsub(/\s+/, ' ').strip.split(/\.|\?|!/)
sentences_sorted = sentences.sort_by { |sentence| sentence.length }
one_third = sentences_sorted.length / 3
ideal_sentences = sentences_sorted.slice(one_third, one_third + 1)
ideal_sentences = ideal_sentences.select { |sentence| sentence =~ /is|are/ }
# Give the analysis back to the user
puts "#{line_count} lines"
puts "#{character_count} characters"
puts "#{character_count_nospaces} characters (excluding spaces)"
puts "#{word_count} words"
puts "#{sentence_count} sentences"
puts "#{paragraph_count} paragraphs"
puts "#{sentence_count / paragraph_count} sentences per paragraph (average)"
puts "#{word_count / sentence_count} words per sentence (average)"
puts "#{good_percentage}% of words are non-fluff words"
puts "Summary:\n\n" + ideal_sentences.join(". ")
puts "-- End of analysis"
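One edge case is worth noting: for an empty file every count is 0, so the integer averages raise `ZeroDivisionError` and the percentage calculation raises `FloatDomainError` when `to_i` is called on the result of `0.0 / 0.0`. The helpers below are hypothetical additions, not part of analyzer.rb, and sketch one way to guard against this.

```ruby
# Hypothetical helpers that return 0 instead of failing when a count is zero.
def safe_average(total, count)
  count.zero? ? 0 : total / count
end

def safe_percentage(part, whole)
  whole.zero? ? 0 : (part.fdiv(whole) * 100).to_i
end

text = ""   # stands in for the contents of an empty file
sentence_count = text.split(/\.|\?|!/).length
paragraph_count = text.split(/\n\n/).length
all_words = text.scan(/\w+/)

puts "#{safe_average(sentence_count, paragraph_count)} sentences per paragraph (average)"
puts "#{safe_percentage(0, all_words.length)}% of words are non-fluff words"

# With non-zero counts the helpers behave like the original expressions.
puts "#{safe_percentage(5, 11)}% of words are non-fluff words"
```

The last line reuses the 5 keywords out of 11 words from section 4.3.1 and prints 45%, the same figure as before.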

In [None]:
puts "2+2 = #{2+2}" # Adds 2+2 to make 4
# A comment on a line by itself

## 4.5 Summary