# Project 2: Topic Classification

In this project, you'll work with text data from newsgroup postings on a variety of topics. You'll train classifiers to distinguish between the topics based on the text of the posts. Whereas with digit classification, the input is relatively dense: a 28x28 matrix of pixels, many of which are non-zero, here we'll represent each document with a "bag-of-words" model. As you'll see, this makes the feature representation quite sparse -- only a few words of the total vocabulary are active in any given document. The bag-of-words assumption here is that the label depends only on the words; their order is not important.

The SK-learn documentation on feature extraction will prove useful:
http://scikit-learn.org/stable/modules/feature_extraction.html

Each problem can be addressed succinctly with the included packages -- please don't add any more. Grading will be based on writing clean, commented code, along with a few short answers.

As always, you're welcome to work on the project in groups and discuss ideas on the course wall, but please prepare your own write-up and write your own code.

In [1]:
# This tells matplotlib not to try opening a new window for each plot.
%matplotlib inline

# General libraries.
import re
import numpy as np
import matplotlib.pyplot as plt

# SK-learn libraries for learning.
from sklearn.pipeline import Pipeline
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB
from sklearn.naive_bayes import MultinomialNB
from sklearn.grid_search import GridSearchCV

# SK-learn libraries for evaluation.
from sklearn.metrics import confusion_matrix
from sklearn import metrics
from sklearn.metrics import classification_report

# SK-learn library for importing the newsgroup data.
from sklearn.datasets import fetch_20newsgroups

# SK-learn libraries for feature extraction from text.
from sklearn.feature_extraction.text import *

Load the data, stripping out metadata so that we learn classifiers that only use textual features. By default, newsgroups data is split into train and test sets. We further split the test so we have a dev set. Note that we specify 4 categories to use for this project. If you remove the categories argument from the fetch function, you'll get all 20 categories.

In [99]:
categories = ['alt.atheism', 'talk.religion.misc', 'comp.graphics', 'sci.space']
newsgroups_train = fetch_20newsgroups(subset='train',
                                      remove=('headers', 'footers', 'quotes'),
                                      categories=categories)
newsgroups_test = fetch_20newsgroups(subset='test',
                                     remove=('headers', 'footers', 'quotes'),
                                     categories=categories)

num_test = len(newsgroups_test.target)
test_data, test_labels = newsgroups_test.data[num_test/2:], newsgroups_test.target[num_test/2:]
dev_data, dev_labels = newsgroups_test.data[:num_test/2], newsgroups_test.target[:num_test/2]
train_data, train_labels = newsgroups_train.data, newsgroups_train.target

print 'training label shape:', train_labels.shape
print 'test label shape:', test_labels.shape
print 'dev label shape:', dev_labels.shape
print 'labels names:', newsgroups_train.target_names

training label shape: (2034L,)
test label shape: (677L,)
dev label shape: (676L,)
labels names: ['alt.atheism', 'comp.graphics', 'sci.space', 'talk.religion.misc']
/n


(1) For each of the first 5 training examples, print the text of the message along with the label.

[2 pts]

In [101]:
def P1(num_examples):
### STUDENT START ###
    for i in range(0,num_examples):
        print "example No: " + str(i+1)
        #show the label for the message
        print "label: " + str(train_labels[i])
        #show the category
        print "category: " + str(newsgroups_train.target_names[train_labels[i]])
        #show the text for the message
        print "text: " + str(train_data[i])+ "\n"
        print "--------------------------------------"

### STUDENT END ###
P1(5)

example No: 1
label: 1
category: comp.graphics
text: Hi,

I've noticed that if you only save a model (with all your mapping planes
positioned carefully) to a .3DS file that when you reload it after restarting
3DS, they are given a default position and orientation.  But if you save
to a .PRJ file their positions/orientation are preserved.  Does anyone
know why this information is not stored in the .3DS file?  Nothing is
explicitly said in the manual about saving texture rules in the .PRJ file. 
I'd like to be able to read the texture rule information, does anyone have 
the format for the .PRJ file?

Is the .CEL file format available from somewhere?

Rych

--------------------------------------
example No: 2
label: 3
category: talk.religion.misc
text: 

Seems to be, barring evidence to the contrary, that Koresh was simply
another deranged fanatic who thought it neccessary to take a whole bunch of
folks with him, children and all, to satisfy his delusional mania. Jim
Jones, circa 1993.




(2) Use CountVectorizer to turn the raw training text into feature vectors. You should use the fit_transform function, which makes 2 passes through the data: first it computes the vocabulary ("fit"), second it converts the raw text into feature vectors using the vocabulary ("transform").

The vectorizer has a lot of options. To get familiar with some of them, write code to answer these questions:

a. The output of the transform (also of fit_transform) is a sparse matrix: http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.csr_matrix.html. What is the size of the vocabulary? What is the average number of non-zero features per example? What fraction of the entries in the matrix are non-zero? Hint: use "nnz" and "shape" attributes.

b. What are the 0th and last feature strings (in alphabetical order)? Hint: use the vectorizer's get_feature_names function.

c. Specify your own vocabulary with 4 words: ["atheism", "graphics", "space", "religion"]. Confirm the training vectors are appropriately shaped. Now what's the average number of non-zero features per example?

d. Instead of extracting unigram word features, use "analyzer" and "ngram_range" to extract bigram and trigram character features. What size vocabulary does this yield?

e. Use the "min_df" argument to prune words that appear in fewer than 10 documents. What size vocabulary does this yield?

f. Using the standard CountVectorizer, what fraction of the words in the dev data are missing from the vocabulary? Hint: build a vocabulary for both train and dev and look at the size of the difference.

[6 pts]

In [179]:
def P2():
### STUDENT START ###
    ########################################################
    #Part A
    #singleGram
    vectorizer = CountVectorizer()
    #bin words, 
    X = vectorizer.fit_transform(train_data)
    #Convert to matrix
    #X.toarray()
    print "Question 1a"
    print "The size of the vocabulary (single grams) is:  " + str(X.shape[1])
    
    print "The average number of non-zero features per example is: ",
    #calc missing values over total number of examples
    print X.nnz/float(X.shape[0]) #shows 196,700 words
    
    print "The fraction of the entries in the matrix that are non-zero is:",
    #Calc Sparsity 
    print str((X.nnz)/float(X.shape[0]*X.shape[1]))+ "\n"
    
    ########################################################
    #Part B
    print "Question 1b" 
    feature_names =vectorizer.get_feature_names()
    print "The 0th feature is: " + str((feature_names[0]))
    print "The last feature is: " + str(feature_names[-1]) +"\n"
    
    ########################################################
    #Part C
    print "Question 1c" 
    #Create new vocab list
    vocab = ["atheism", "graphics", "space", "religion"]
    #binarize it
    vocab_array = vectorizer.fit_transform(vocab) #.toarray()
    #apply bins to training dataset
    vocab_2_data = vectorizer.transform(train_data) #.toarray()
    #calc blank rate and overall shape
    print "The average number of non-zero features per example (with new 4 word vocab) is: ",
    print  str((vocab_2_data.nnz/float(vocab_2_data.shape[0]))) + "\n"
        
    ########################################################
    #Part D
    print "Question 1d" 
    #for words, Bi and TriGrams
    bigram_vectorizer = CountVectorizer(analyzer='char',ngram_range=(2, 3),token_pattern=r'\b\w+\b', min_df=1)

    #bin words and convert to matrix
    X_2 = bigram_vectorizer.fit_transform(train_data)#.toarray()
    X_2
    print "Bigrams and trigrams result in a vocabulary of size: " + str(X_2.shape[1]) + "\n"
    
    ########################################################
    #Part E
    print "Question 1e" 
    #for words, Bi and TriGrams
    bigram_vectorizer = CountVectorizer(min_df=10)

    #bin words and convert to matrix
    X_2 = bigram_vectorizer.fit_transform(train_data)#.toarray()
    X_2
    print "Words with at least 10 instances result in a vocabulary of size: " + str(X_2.shape[1]) + "\n"
    
    ########################################################
    #Part F
    print "Question 1f" 
    X = vectorizer.fit_transform(dev_data)
    dev_feature_names =vectorizer.get_feature_names()
    missing = 0
    for i in range(0,len(dev_feature_names)):
        if dev_feature_names[i] not in (feature_names):
            missing += 1
    print "The fraction of the words in the dev data  missing from the vocabulary is:  " + str(float(missing)/X.shape[1])
    
    ### STUDENT END ###
P2()

Question 1a
The size of the vocabulary (single grams) is:  26879
The average number of non-zero features per example is:  96.7059980334
The fraction of the entries in the matrix that are non-zero is: 0.00359782722696

Question 1b
The 0th feature is: 00
The last feature is: zyxel

Question 1c
The average number of non-zero features per example (with new 4 word vocab) is:  0.268436578171

Question 1d
Bigrams and trigrams result in a vocabulary of size: 35478

Question 1e
Words with at least 10 instances result in a vocabulary of size: 3064

Question 1f
4027
The fraction of the words in the dev data  missing from the vocabulary is:  0.247876400345


(3) Use the default CountVectorizer options and report the f1 score (use metrics.f1_score) for a k nearest neighbors classifier; find the optimal value for k. Also fit a Multinomial Naive Bayes model and find the optimal value for alpha. Finally, fit a logistic regression model and find the optimal value for the regularization strength C using l2 regularization. A few questions:

a. Why doesn't nearest neighbors work well for this problem?

b. Any ideas why logistic regression doesn't work as well as Naive Bayes?

c. Logistic regression estimates a weight vector for each class, which you can access with the coef\_ attribute. Output the sum of the squared weight values for each class for each setting of the C parameter. Briefly explain the relationship between the sum and the value of C.

[4 pts]

In [338]:
def P3():
    ### STUDENT START ###
    ##################################################
    #create n grams
    #default CountVectorizer options 
    #singleGram
    vectorizer = CountVectorizer()
    train_vec = vectorizer.fit_transform(train_data).toarray()
    #train_vec.shape
    #train_array = train_vec.toarray()
    #print train_array [1:20,]
    test_vec = vectorizer.transform(test_data)
    #print test_vec.shape

    ##################################################
    #Build KNN model
    #report the f1 score (use metrics.f1_score) for a k nearest neighbors classifier;
    #find the optimal value for k. 
    knn_model = KNeighborsClassifier()
    n_neighbors = {'n_neighbors': [1, 3, 5, 7, 9, 15]} 
    #sklearn library required a more explicit measure (ie f1 alone implies binary results and this is a multi class problem)
    grid = GridSearchCV(estimator=knn_model, param_grid=n_neighbors, scoring = "f1_weighted")
    grid.fit(train_vec, train_labels)

    # summarize the results of the grid search
    print "KNN best f1 score on Training Data: " + str(grid.best_score_)
    print "KNN best nearest neighbor count: " + str(grid.best_estimator_.n_neighbors) + "\n"

    ##################################################
    #Build MNB model
    #Multinomial Naive Bayes model and find the optimal value for alpha. 
    #gnb_model = BernoulliNB(binarize=0.5) #binarizer function is cutting off here!!
    alphas = {'alpha': [0.0001, 0.001, 0.01, 0.1, 0.5, 1.0, 2.0, 10.0]}
    grid = GridSearchCV(estimator=clf_model, param_grid=alphas, scoring = "f1_weighted")
    grid.fit(train_vec, train_labels)
    #print(grid)
    # summarize the results of the grid search
    print "Multinomial NB Model best F1: " + str((grid.best_score_))
    print "Multinomial NB Model best Alpha: " +str (grid.best_estimator_.alpha) + "\n"


    #clf_pred = clf_model.fit(train_vec, train_labels, alpha = grid.best_estimator_.alpha).predict(test_vec)
    #wrong_prediction = (clf_pred != test_labels)
    #print "Multinomial NB Model - Score on Test Pred: ",
    #print clf_model.metrics.f1_score


    ##################################################
    #Build LR model
    #fit a logistic regression model and find the optimal value for the regularization strength C using l2 regularization.
    #LR Model
    regularization_c = {'C': [0.0001, 0.001, 0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99,1]} #5,10,100]}
    LR_model = LogisticRegression(penalty="l2")
    grid = GridSearchCV(estimator=LR_model, param_grid=regularization_c, scoring = "f1_weighted")
    grid.fit(train_vec, train_labels)
    #summarize the results of the grid search
    print "Logistic Regression best f1 score for training data: " + str(grid.best_score_)
    print "Logistic Regression best regularization parameter: "+ str(grid.best_estimator_.C)


### STUDENT END ###
P3()

KNN best f1 score on Training Data: 0.417778915182
KNN best nearest neighbor count: 15

Multinomial NB Model best F1: 0.828718198342
Multinomial NB Model best Alpha: 0.01

Logistic Regression best f1 score for training data: 0.767943492932
Logistic Regression best regularization parameter: 0.25


ANSWER:

a. Why doesn't nearest neighbors work well for this problem? Maybe there is too much noise with overlapping basic words that makes 2 pieces of text seem more similar than they really are.

b. Any ideas why logistic regression doesn't work as well as Naive Bayes? Logistic regression doesn't handle lots of features especially well. 

In [None]:
#LR Model
regularization_c = {'C': [0.0001, 0.001, 0.01, 0.1, 0.5, 0.9, 0.99,1,5,10,100]}
LR_model = LogisticRegression(penalty="l2")
grid = GridSearchCV(estimator=LR_model, param_grid=regularization_c, scoring = "f1_weighted")
grid.fit(train_vec, train_labels)
# summarize the results of the grid search
print "Logistic Regression best f1 score for training data: " + str(grid.best_score_)
print "Logistic Regression best regularization parameter: "+ str(grid.best_estimator_.C)

scores = [x[1] for x in grid.grid_scores_]
print 'scores',
print scores

print 'regularization_c',
print [0.0, 0.0001, 0.001, 0.01, 0.1, 0.5, 1.0, 2.0, 10.0]

#Logistic regression estimates a weight vector for each class, which you can access with the coef_ attribute. 
#Output the sum of the squared weight values for each class for each setting of the C parameter. 
#Briefly explain the relationship between the sum and the value of C.

In [337]:
def P3(regularization_c, accuracies):

### STUDENT START ###
    #iterate over different regularization weights
    for c_size in (regularization_c):
        #builds model
        LR_model = LogisticRegression(penalty="l2", C = c_size)
        LR_model.fit(train_vec, train_labels)
        
        #makes predictions on the dev dataset with the model I just built
        test_predicted_labels = LR_model.predict(test_vec)

        #accuracy Calcs
        wrong_prediction = (test_predicted_labels != test_labels)
        print "Regularization weight: " + str(c_size), 
        
        print LR_model.coef_[1:4]
        categories_coef = LR_model.coef_[0:4]
        for j in categories_coef:
            coef_SSE = sum((categories_coef[j]**2))
            print "SSE" + str(coef_SSE)
        #time_list.append(run_time)

### STUDENT END ###
regularization_c =  [0.0001, 0.001, 0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99,1]
accuracies = []
P3(regularization_c, accuracies)
print accuracies

Regularization weight: 0.0001 [[ -1.52533940e-03  -1.18724621e-03  -7.41337960e-05 ...,  -4.49042447e-05
   -8.98084894e-05  -3.63943428e-05]
 [  5.36222087e-04  -1.84070945e-03  -9.29400417e-05 ...,  -4.10522482e-05
   -8.21044964e-05   7.18034641e-05]
 [  6.53037643e-04   3.06177763e-03   9.30135863e-05 ...,   4.93317908e-05
    9.86635817e-05  -3.20281368e-05]
 [ -1.55153047e-03  -2.11939984e-03  -8.16528016e-05 ...,  -3.74927861e-05
   -7.49855723e-05  -3.47161226e-05]]


IndexError: arrays used as indices must be of integer (or boolean) type

ANSWER:
c. Logistic regression estimates a weight vector for each class, which you can access with the coef_ attribute. Output the sum of the squared weight values for each class for each setting of the C parameter. Briefly explain the relationship between the sum and the value of C.


(4) Train a logistic regression model. Find the 5 features with the largest weights for each label -- 20 features in total. Create a table with 20 rows and 4 columns that shows the weight for each of these features for each of the labels. Create the table again with bigram features. Any surprising features in this table?

[5 pts]

In [6]:
#def P4():
### STUDENT START ###
LR_model = LogisticRegression(penalty="l2", C = c_size)
LR_model.fit(train_vec, train_labels)

#makes predictions on the dev dataset with the model I just built
test_predicted_labels = LR_model.predict(test_vec)

#accuracy Calcs
wrong_prediction = (test_predicted_labels != test_labels)
print "Regularization weight: " + str(c_size), 

print LR_model.coef_[0:4]
categories_coef = LR_model.coef_[0:4]

### STUDENT END ###
#P4()

ANSWER:

(5) Try to improve the logistic regression classifier by passing a custom preprocessor to CountVectorizer. The preprocessing function runs on the raw text, before it is split into words by the tokenizer. Your preprocessor should try to normalize the input in various ways to improve generalization. For example, try lowercasing everything, replacing sequences of numbers with a single token, removing various other non-letter characters, and shortening long words. If you're not already familiar with regular expressions for manipulating strings, see https://docs.python.org/2/library/re.html, and re.sub() in particular. With your new preprocessor, how much did you reduce the size of the dictionary?

For reference, I was able to improve dev F1 by 2 points.

[4 pts]

In [395]:
stopwords = ['i','me','my','myself','we','our','ours','ourselves','you','your','yours','yourself','yourselves','he','him','his','himself','she','her','hers','herself','it','its','itself','they','them','their','theirs','themselves','what','which','who','whom','this','that','these','those','am','is','are','was','were','be','been','being','have','has','had','having','do','does','did','doing','a','an','the','and','but','if','or','because','as','until','while','of','at','by','for','with','about','against','between','into','through','during','before','after','above','below','to','from','up','down','in','out','on','off','over','under','again','further','then','once','here','there','when','where','why','how','all','any','both','each','few','more','most','other','some','such','no','nor','not','only','own','same','so','than','too','very','s','t','can','will','just','don','should','now']
#taken from https://github.com/postgres/postgres/tree/master/src/backend/snowball/stopwords
test2 = "I want to be done with this project now so that I can have a social life again"
word_list = test2.split()
' '.join([k for k in word_list if k not in stopwords])

'I want done project I social life'

In [395]:
stopwords = ['i','me','my','myself','we','our','ours','ourselves','you','your','yours','yourself','yourselves','he','him','his','himself','she','her','hers','herself','it','its','itself','they','them','their','theirs','themselves','what','which','who','whom','this','that','these','those','am','is','are','was','were','be','been','being','have','has','had','having','do','does','did','doing','a','an','the','and','but','if','or','because','as','until','while','of','at','by','for','with','about','against','between','into','through','during','before','after','above','below','to','from','up','down','in','out','on','off','over','under','again','further','then','once','here','there','when','where','why','how','all','any','both','each','few','more','most','other','some','such','no','nor','not','only','own','same','so','than','too','very','s','t','can','will','just','don','should','now']
#taken from https://github.com/postgres/postgres/tree/master/src/backend/snowball/stopwords

'I want done project I social life'

In [400]:
def empty_preprocessor(s):
    return s

def better_preprocessor(s):
### STUDENT START ###
    #to lower
    s = s.lower()
    #removes paragraphs
    s = re.sub(r'[^\w]', ' ',s)
    #replace all numbers
    s = re.sub("\d+", "NUM", s)
    #Remove special characters
    s = re.sub('^\s[^A-Za-z0-9]+',"",s)
    
    #white noise words; ubiquitious and of little value
    stopwords = ['i','me','my','myself','we','our','ours','ourselves','you','your','yours','yourself','yourselves','he','him','his','himself','she','her','hers','herself','it','its','itself','they','them','their','theirs','themselves','what','which','who','whom','this','that','these','those','am','is','are','was','were','be','been','being','have','has','had','having','do','does','did','doing','a','an','the','and','but','if','or','because','as','until','while','of','at','by','for','with','about','against','between','into','through','during','before','after','above','below','to','from','up','down','in','out','on','off','over','under','again','further','then','once','here','there','when','where','why','how','all','any','both','each','few','more','most','other','some','such','no','nor','not','only','own','same','so','than','too','very','s','t','can','will','just','don','should','now']

    #remove stop words
    print s
    ' '.join([k for k in s if k not in stopwords])
    print s
    return s


### STUDENT END ###

def P5():
### STUDENT START ###
    #################################################
    #Vanilla vectorizing
    vanilla_vectorizer = CountVectorizer(preprocessor=empty_preprocessor)
    vanilla_train_vec = vanilla_vectorizer.fit_transform(train_data).toarray()
    vanilla_test_vec = vanilla_vectorizer.transform(test_data)
    
    #vocab size
    print "No Preprocessing vocabulary size: " + str(vanilla_train_vec.shape[1])
    
    #Model with no preprocessing 
    vanilla_LR_model = LogisticRegression(penalty="l2", C = 0.2)
    vanilla_LR_model.fit(vanilla_train_vec, train_labels)
    
    #makes predictions on the dev dataset with the model I just built
    vanilla_test_pred_labels = vanilla_LR_model.predict(vanilla_test_vec)
    
    #accuracy Calcs (using f1 for multiclass problem + LR model)
    vanilla_f1 = metrics.f1_score(vanilla_test_pred_labels, test_labels)
    print "No Preprocessing  f1: " + str(vanilla_f1) 
    
    #################################################
    #Special Preprocessing ---seriously missing nltk library :(
    special_vectorizer = CountVectorizer(preprocessor=better_preprocessor)
    special_train_vec = special_vectorizer.fit_transform(train_data[1:3]).toarray()
    special_test_vec = special_vectorizer.transform(test_data)
    
    #vocab size
    print "Special Preprocessing vocabulary size: " + str(special_train_vec.shape[1])
    
    #Model with no preprocessing 
    special_LR_model = LogisticRegression(penalty="l2", C = 0.2)
    special_LR_model.fit(special_train_vec, train_labels[1:3])
    
    #makes predictions on the dev dataset with the model I just built
    special_test_pred_labels = special_LR_model.predict(special_test_vec)
    
    #accuracy Calcs (using f1 for multiclass problem + LR model)
    special_f1 = metrics.f1_score(special_test_pred_labels, test_labels)
    print "Special Preprocessing  f1: " + str(special_f1)

### STUDENT END ###
P5()

No Preprocessing vocabulary size: 33291


  sample_weight=sample_weight)


No Preprocessing  f1: 0.718026391583
seems to be  barring evidence to the contrary  that koresh was simply another deranged fanatic who thought it neccessary to take a whole bunch of folks with him  children and all  to satisfy his delusional mania  jim jones  circa NUM    nope   fruitcakes like koresh have been demonstrating such evil corruption for centuries 
seems to be  barring evidence to the contrary  that koresh was simply another deranged fanatic who thought it neccessary to take a whole bunch of folks with him  children and all  to satisfy his delusional mania  jim jones  circa NUM    nope   fruitcakes like koresh have been demonstrating such evil corruption for centuries 
in article  NUMaprNUM NUM NUM sq sq com   msb sq sq com  mark brader    mb                                                              so the mb  NUM figure seems unlikely to actually be anything but a perijove   jg sorry  _perijoves_   i m not used to talking this language   couldn t we just say periapsis 

 this is a good question   there are major blind spots in our understanding of what makes the earth habitable   for example  why does the earth s atmosphere have the concentration of oxygen it does   the naive answer is  photosynthesis   but this is clearly incomplete   photosynthesis by itself can t make the atmosphere oxygenated  as the oxygen produced is consumed when the plants decay or are eaten   what is needed is photosynthesis plus some mechanism to sequester some fraction of the resulting reduced material   on earth  this mechanism is burial in seafloor sediments of organic matter  mostly from oceanic sources   however  this burial requires continental sediments  in the deep ocean  the burial rate is so slow that most material is consumed before it can be sequestered    this suggests that a planet without large oceans  or a planet without continents undergoing weathering  will have a hard time accumulating an oxygen atmosphere   in particular  an all ocean planet may have a ha

the peaceful attempt to serve the warrant was met with gunfire  due process was not served because the branch davidians wanted it that way    you  think on that        milk is for babies  when you re a man  you drink beer    arnold
the peaceful attempt to serve the warrant was met with gunfire  due process was not served because the branch davidians wanted it that way    you  think on that        milk is for babies  when you re a man  you drink beer    arnold
discussion of pros and cons deleted    could someone give me the references to the llnl proposal   i ve been meaning to track it down in conjuntion with something i m working on   it s not  directly related to space stations  but i think many of the principles will  carry over  
discussion of pros and cons deleted    could someone give me the references to the llnl proposal   i ve been meaning to track it down in conjuntion with something i m working on   it s not  directly related to space stations  but i think many of the princi

this response originally fell into a bit bucket   i m reposting it just so bill doesn t think i m ignoring him    bill   i m sorry to have been busy lately and only just be getting around to this   apparently you have some fundamental confusions about atheism  i think many of these are well addressed in the famous faq   your generalisms are then misplaced    atheism needn t imply materialism  or the lack of an absolute moral system   however  i do tend to materialism and don t believe in absolute morality  so i ll answer your questions    an atheist judges value in the same way that a theist does  according to a personal understanding of morality   that i don t believe in an absolute one doesn t mean that i don t have one   i m just explicit  as in the line of postings you followed up  that when i express judgment on a moral issue i am basing my judgment on my own code rather than claiming that it is in some absolute sense good or bad  my moral code is not particular different from tha

archive name  jpeg faq last modified  NUM may NUM  this faq article discusses jpeg image compression   suggestions for additions and clarifications are welcome   new since version of NUM may NUM      added info on imageviewer for next    this article includes the following sections    NUM   what is jpeg   NUM   why use jpeg   NUM   when should i use jpeg  and when should i stick with gif   NUM   how well does jpeg compress images   NUM   what are good  quality  settings for jpeg   NUM   where can i get jpeg software       NUMa   canned  software  viewers  etc       NUMb  source code  NUM   what s all this hoopla about color quantization   NUM   how does jpeg work   NUM   what about lossless jpeg   NUM   why all the argument about file formats   NUM   how do i recognize which file format i have  and what do i do about it   NUM   what about arithmetic coding   NUM   does loss accumulate with repeated compression decompression   NUM   what are some rules of thumb for converting gif images

they looked unto him  and were lightened   and their faces were not ashamed 
they looked unto him  and were lightened   and their faces were not ashamed 
i m afraid i was not able to find the gifs    is the list  updated weekly  perhaps  or am i just missing something 
i m afraid i was not able to find the gifs    is the list  updated weekly  perhaps  or am i just missing something 
i guess i m delving into a religious language area   what exactly is morality  or morals   i never thought of eating meat to be moral or immoral  but i think it could be   how do we differentiate between not doing something because it is a personal choice or preference and not doing something because we see it as  immoral   do we fall to what the basis of these morals are   also  consensus positions fall to a might makes right   or  as you brought out  if whatever is right is what is societally mandated then whoever is in control at the time makes what is right  mc mac                                       

archive name  atheism resources alt atheism archive name  resources last modified  NUM april NUM version  NUM NUM                                atheist resources                        addresses of atheist organizations                                       usa  freedom from religion foundation  darwin fish bumper stickers and assorted other atheist paraphernalia are available from the freedom from religion foundation in the us   write to   ffrf  p o  box NUM  madison  wi NUM  telephone   NUM  NUM NUM  evolution designs  evolution designs sell the  darwin fish    it s a fish symbol  like the ones christians stick on their cars  but with feet and the word  darwin  written inside   the deluxe moulded NUMd plastic fish is  NUM NUM postpaid in the us   write to   evolution designs  NUM laurel canyon  NUM  north hollywood             ca NUM   people in the san francisco bay area can get darwin fish from lynn gold    try mailing  figmo netcom com    for net people who go to lynn directly  t

in article  NUMaprNUM NUM NUM organpipe uug arizona edu        hmm  it seems that this is the core of christianity then  you    have to feel guilty  and then there s this single personality   that will save you from this universal guilt feeling       brian  i will tell you a secret  i don t feel guilty at all    i do mistakes  and i regret them  however i ve never had this   huge guilt feeling hanging over my shoulder   i will tell you another secret   i get this burning sensation in my hand every time i hold it over a candle   the pain does not fill my entire body  and i m told the longer i hold it here  the less it ll hurt  it ll eventually burn up the nerves  or so i m told    so i suppose i should just ignore the pain  because holding my hand over the candle is something i just want to do  i ve got the right  don t i   your body feels pain to let you know something is wrong   it s your body s alarm system informing you that something needs your attention   a fever tells you that yo

 reply to frank dNUMsNUM uucp  frank o dwyer         if the good is undefined  undefinable   but you require of everyone that they know innately what is right  you are back to subjectivism      ditto here   an evaluative statement implies a value judgement on the part of the person making it      pretty perceptive  that prof  flew        please explain how this helps   i don t see your argument        this makes no sense either   flew is arguing that this is where the objectivist winds up  not the subjectivist   furthermore  the nihilists believed in nothing  except  science  materialism  revolution  and the people        and also not the position of the subjectivist  as has been pointed out to you already by others   ditch the strawman  already  and see my reply to mike cobb s root message in the thread societal basis for morality 
 reply to frank dNUMsNUM uucp  frank o dwyer         if the good is undefined  undefinable   but you require of everyone that they know innately what is ri

free energy technology                        by robert e  mcelwaine  physicist                           ninety to a hundred years ago  everybody  knew  that a            heavier than air machine could not possibly fly   it would            violate the  laws  of physics   all of the  experts  and             authorities  said so                             for example  simon newcomb declared in NUM    the            demonstration that no possible combination of known            substances  known forms of machinery and known forms of            force  can be united in a practical machine by which man            shall fly long distances through the air  seems to the writer            as complete as it is possible for the demonstration of any            physical fact to be                              fortunately  a few smart people such as the wright            brothers did not accept such pronouncements as the final            word   now we take airplanes for granted   except when they

rapture   october NUM  NUM    what to do in case you miss the rapture  i  stay calm and do not panic   your natural reaction once you realize what has just occurred is to panic   but to do so is absolutely useless now   if you had wanted to get right with god before the rapture  you could have  but you chose to wait   now your  only chance is to stay on this earth and to endure to the end of the  tribulation    but he that shall endure unto the end  the same shall be  saved     matthew NUM NUM  ii  realize you are now living during the great tribulation   the great tribulation is a seven year period starting from the time of the rapture until christ s second coming   also know as  the time of jacob s  israel s  trouble   jere NUM NUM  and  daniel s seventieth week   dan NUM   this  period will be unparalleled in trouble and horror   iii  gather as many bibles as you can and hide them   soon after the antichrist becomes the leader of the european community   the revived roman empire   b

brian ceccarelli presents us with the fallacy of false dichotomy in stating that we must accept every thing in the books attributed to peter  or we must discount every other book of antiquity    mr ceccarelli  you seem to be stating that we must accept accept everything written in every  historical  document   somehow i doubt do that yourself that   thus since i doubt you accept everything written in every historical document  i would ask how you can thereby objectively justify complete faith in the words of the books attributed to peter    i shall now give an example of a document from antiquity  which i am sure you reject  it dates from the time of ramses ii  this was first presented here by matthew wiener    these inscriptions were carved soon after a battle  and were carved with the pharoah s specific approval so we have true originals  rather than mere copies   this account records the the battle of kadesh  circa NUM bc   which occurred on the river orontes   about NUM miles south

 in the part of the posting you have so helpfully deleted  i  pointed out that they used the wording from the english bill of rights apparently  changing  what they understood by it  and i asked why then should we  two hundred years later  be bound by what keith allan schneider  thinks  they understood by it    so one cannot say  a cruel fate    your prevarications are getting increasingly unconvincing  i think 
 in the part of the posting you have so helpfully deleted  i  pointed out that they used the wording from the english bill of rights apparently  changing  what they understood by it  and i asked why then should we  two hundred years later  be bound by what keith allan schneider  thinks  they understood by it    so one cannot say  a cruel fate    your prevarications are getting increasingly unconvincing  i think 
ron miller is a space artist with a long and distinguished career    i ve admired both his paintings  remember the usps solar system exploration stamps last year   and 

let s make a deal    if you re going to put up a billion  i d want to budget the whole sheebang for  NUM NUM million   if i have that much money to throw around in the first place  you betcha i m going to sign a contract committing to volume production     
let s make a deal    if you re going to put up a billion  i d want to budget the whole sheebang for  NUM NUM million   if i have that much money to throw around in the first place  you betcha i m going to sign a contract committing to volume production     
i heard a friend who just return from nab from las vegas confirm that realsoft will be releasing a windows version of real NUMd NUM NUM this summer   he was told that the rendering speed on the dxNUM isn t as fast as aNUM   however  he was also told that they are switching from microsoft c   to watcom to gain more speed   for people who is looking for a powerful NUMd animation software for pc   the wait shouldn t be too long   real NUMd NUM NUM is absolutely the most powerful and

 first and foremost  i honestly do not believe that jesus was anything more than a man who lived and died two thousand years ago   i know your bible provides wonderful stories of the things he said and did  but i simply do not believe that he still exists as an entity that has any bearing on this universe or the lives in it  and i similarly do not believe that the god that you worship exists or has ever existed   period   i view religion in general and christianity in specific as a  cultural virus  that has been passed down from generation to generation because people are often too afraid to think for themselves and claim responsibility for their own fate  so they brainwash themselves and their children into believing the popular myths  and it goes on from there   and eventually christianity becomes a given    if so many other people believe in it  it must be right  no   i don t believe in any  life after death    i believe that when i die  i die  so therefore it s up to me to try to b

think about what you are saying here  the NUM bit image is quantised down to NUM bits so many  similar  colours are mapped onto a single palette colour  this colour gets modified in fairly arbitrary ways  you then want to apply these modifications back to the NUM bit file  so you have to find which colours mapped to this one palette colour  ok you could do this by copying the NUM bit file to a NUM bit file and using the extra NUM bits to hold the index entry   having done this  you need to do something to them     what  exactly   apply the difference in rgb between the original and modified palette entry to each colour in the group  this could generate colours with rgb outside the range NUM   NUM  it would also lead to discontinuities when different parts of a smooth colour gradient mapped to several different palette entries   you could interpolate from full modification to no modification depending how far each colour was from the palette entry  however i suspect this would look rath

noting that a particular society  in this case the mainland uk       has few religously motivated murders  and few murders of  any       kind  says very little about whether inter religion murders elsewhere      are religiously motivated           no  but it allows one to conclude that there is nothing inherent     in all religion  or for that matter  in catholicism and protestantism      that motivates one to kill      motivates  or  allows      the christian bible says that one may kill  under certain circumstances    in fact  it instructs one to kill under  certain circumstances        i d say the majority of people have a moral system that instructs them to kill under certain circumstances   i do get your distinction between motivate and allow  and i do agree that if a flavour of theism  allows  atoricities  then that s an indictment of that theism   but it rather depends on what the  certain circumstances  are   when you talk about christianity  or islam  then at least your claims

you should have heard prof  mcnally   from my days as an astronomy undergraduate  denouncing photon pollution  it was easy to imagine him taking practical steps to modify the sodium lamps on the street outside mill hill observatory with a NUM gauge shotgun      however  seriously  it is possible to limit the effects of streetlights  by adding a reflector  so that the light only illuminates the ground  which is after all where you need it  as a bonus  the power consumption required for a given illumination level is reduced  strangely enough  astronomers often seek to lobby elected local authorities to use such lighting systems  with considerable success in the desert areas around the major us observatories  at least  thats what mcnally told us  all those years ago    british local authorities couldn t care less  as far as i can see    i suppose that the  right  to dark skies is no more than an aspiration  but it is a worthwhile one  illuminated orbital billboards seem especially yukky  

stuff deleted      thank you   i thought i was in the twilight zone for a moment  it still amazes me that many people with science backgrounds  still confuse the models and observables with what even they would call the real world    jim halat                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     

i am not sure that i am supposed to post this mail here  however during the last year  while i was involved in developing graphical user interface  gui  applications  i have enjoyed being personally part of this news group wherin i got some interesting information which helped me in my work  i am posting my resuming hoping that people working in my area would make time to look at it  ________________________________________________________________________________ _       NUMa westgate hall        isu  ames  ia NUM         NUM  NUM NUM       april NUM  NUM   dear prospective employer   i am seeking employment as a software engineer with interests in software design and development  in which i can utilize my experience in hardware   c   c   programming  graphical user interface  gui   operating systems and computer networking   i received my bachelors of engineering  be  degree in electronics engineering in NUM and a m s  degree in electrical engineering in dec NUM  from iowa state unive

a user on my bbs  accidentally  deleted his vga driver for his oakNUM card and has no backup   i was wondering if someone knew of an ftp site  and path  please   where such a thing might be obtained   thanks  
existance                                                                                   i just thought it necessary to help defend the point that jesus  existed   guys  jesus existed   if he didnt  then you have to say that  socrates didnt exist cuz he  like jesus  has nothing from his hands that  have survived   only plato and others record his existance   many others  record jesus  existance  including the babylonian talmud   sorry guys  the  argument that jesus may not have existed is a dead point now   he did    whether he was god or whether there is a god is a completely different  story  however                                                                                 
existance                                                                                   i ju

 wow  you got me thinking now   this is an interesting question in that recently there has been a  move in society to classify previously  socially unacceptable  yet legal  activities as ok   in the past it seems to me there were always two  coexisting methods of social control   first  and most explicit  is legal control   that is the set of  actions we define as currently illegal and having a specifically defined set  of punishments   secondly  and somewhat more hidden  is social control   these are  the actions which are considered socially unacceptable and while not covered  by legal control  are scrictly controled by social censure  ideally  if  socialization is working as it should  legal control is hardly ever needed  since most people voluntarilly control their actions due to the pressure of  social censure   the control manifests itself in day to day life as  guilt  and   morality    i ve heard it said  and fully believe  that if it weren t for  the vast majority of people pol

within a few months  i ll be looking for a job in NUMd computer graphics software   i m in need of info on companies that do it   there s nothing in any of the faq s for this group  and nothing at siggraph org  at least i couldn t find anything    the last computer graphics career handbook was dated NUM  had info on NUM companies  but nothing specific on any of them   can people please direct me towards more current and detailed sources of information   i ll post a summary of sources if there s interest   also  could you please e mail me  our news server is on the fritz      thanks  brandon 
within a few months  i ll be looking for a job in NUMd computer graphics software   i m in need of info on companies that do it   there s nothing in any of the faq s for this group  and nothing at siggraph org  at least i couldn t find anything    the last computer graphics career handbook was dated NUM  had info on NUM companies  but nothing specific on any of them   can people please direct me to

about my reply    it a society that is constantly on the verge of flaming  usenet  diplomacy is the best way to ensure the voice of reason gets through  isn t it    kevin  unfortunately you are now delving into field i know too little about  algorithms  your reasoning  as i see it  is very much along the lines of roger penrose  who claimed that mathematical  insight  cannot be algorithmic in his book _the emperor s new mind  concerning computers  minds  and the laws of physics_  however  penrose s claim that he _has_ mathematical insight  or your similar claim that wavefunctions collapse only when we consciously take a look  could be just illusions   we are obviouslu taking very different viewpoints   i try to ponder on the problem of consciousness from an evolutionary perspective  realising that it might not be anything special  but certainly useful  thinking back of what i wrote  do you think worms have minds or not  they are able to experience pain  at least they behave  just like t

i had to turn to one of my problem sets that i did in class for this little problem   i don t have a calculator  but i do have the problem set that we did not too long ago  so i ll use that  and hope it s what you wanted    this is a highly simplified problem  with a very simple burst   bursts are usually more complex than this example i will use here          our burst has a peak flux of NUM NUMe NUM ergs cm  NUM sec  NUM and a duration of NUM NUM seconds   during the frst second of the burst  and the last NUM seconds  its flux is half of the peak flux   it s flux is the peak flux the rest of the time   assume that the background flux is NUMe NUM erg cm  NUM sec  NUM          then we had to find the integrated luminosity of the burst  for several different spheres  r  NUMpc oort cloud radius   r NUM NUMpc at the edge of the galaxy   r NUM NUMpc or the edge of the galactic corona  and lastly at a r NUMmpc            we integrated the flux over all time to find the fluence  then used th

 in article  NUMrNUMtqo ook horus ap mchp sni de              theism is strongly correlated with irrational belief in absolutes  irrational          belief in absolutes is strongly correlated with fanatism          deletion           theism is correlated with fanaticism  i have neither said that all fanatism      is caused by theism nor that all theism leads to fanatism  the point is       theism increases the chance of becoming a fanatic  one could of course      argue that would be fanatics tend towards theism  for example   but i just      have to loook at the times in history when theism was the dominant ideology      to invalidate that conclusion that that is the basic mechanism behind it           imo  the influence of stalin  or for that matter  ayn rand  invalidates your     assumption that theism is the factor to be considered         bogus  i just said that theism is not the only factor for fanatism     the point is that theism is  a  factor       that s your claim  now back 

this presupposes that no supersonic ramjet aircraft spacecraft can be reliable or low cost   this is unproven 
recently  i ve asked myself a rather interesting question  what right does god have on our lives  always assuming there is a god  of course          in his infinite wisdom  he made it perfectly clear that if we don t live according to his rules  we will burn in hell  well  with what right can god make that desicion  let s say  for the sake of argument  that god creates every one of us  directly or indirectly  it doesn t matter    what then happens  is that he first creates us  and then turns us lose  well  i didn t ask to be created    let s make an analogue  if a scientist creates a unique living creature  which has happened  it was even patented         does he then have the right to expect it to behave in a certain matter  or die      who is god to impose its rules on us   who can tell if god is really so righteous as god likes us to believe  are all christians a flock of s

i hope someone can help me with the following problem   i m sure there must be a known solution   given a rectangle defined by   x    x    x  and  y    y    y  where x and y are constant  and a parallelogram defined by   cNUM    a x   b y    cNUM  and  cNUM    c x   d y    cNUM  where cNUM  cNUM  a  b  c  d are constants and b a    d c  i e  not parallel lines                                                                              not equal to  what is the area of their intersection      what i m after is some general algorithm suitable for all rectangles and parallelograms that can be described by the above equations   at the  moment it looks like i m going to have to look at all possible cases  and examine each seperately e g   NUM  rectangle encloses parallelogram    NUM  parallelogram encloses rectangle   NUM  two corners of parallelogram inside rectangle                                                                                                                            

a s t r o   f t p   l i s t                               updated NUM NUM NUM     this  is  a  short  description  of  anonymous ftp  file  servers  containing   astronomy  and space research related material   i have  included only  those   servers  where  there are  special subdirectories  for  astro stuff  or  much   material  included into  a general  directories   this list is not a complete   data set of possible places   so i would be very happy of all kind of notices   and information depending on this listing      the newest version of this file is available via anonymous ftp as                     nic funet fi  pub astro general astroftp txt                                            there are also many mirror  copy  archives  for  simtel NUM army mil  pc   and    sumex aim stanford edu  mac  which are not included into this list  only some   of mirroring sites are listed                                                     veikko makela                                        

how come noone mentions eric hoffer when talking about  fanatic behavior anymore   
how come noone mentions eric hoffer when talking about  fanatic behavior anymore   
i m not sure that you can distinguish between myth and legend so neatly  or at all   a myth is more than a single story   the thought  structure and world paradigm in which that story is interpreted is as important a part of the myth as the story itself   thus  i can think of no story which is meant to be conveyed understandably from one person to another within a single culture which won t rest upon that underlying thought structure  and thus transmit some of that culture s mythical  truths  along with it   
i m not sure that you can distinguish between myth and legend so neatly  or at all   a myth is more than a single story   the thought  structure and world paradigm in which that story is interpreted is as important a part of the myth as the story itself   thus  i can think of no story which is meant to be conveyed u

 book review                                    the universe of motion   by dewey b  larson  NUM  north            pacific publishers  portland  oregon  NUM pages  indexed             hardcover                               the universe of motion  contains final solutions to            most all astrophysical mysteries                             this book is volume iii of a revised and enlarged            edition of  the structure of the physical universe   NUM              volume i is  nothing but motion   NUM   and volume ii is             the basic properties of matter   NUM                              most books and journal articles on the subject of            astrophysics are bristling with integrals  partial            differentials  and other fancy mathematics   in this book  by            contrast  mathematics is conspicuous by its absence  except            for some relatively simple formulas imbedded in the text              larson emphasizes concepts and declares that math

reply to frank dNUMsNUM uucp  frank o dwyer         the problem for the objectivist is to determine the status of moral    truths and the method by which they can be established   if we accept    that such judgements are not reports of what is but only relate to    what ought to be  see naturalistic fallacy  then they cannot be proved    by any facts about the nature of the world       this can be avoided in at least two ways   NUM  by leaving the good   undefined  since anyone who claims that they do not know what it is is   either lying or so out of touch with humanity as to be undeserving of a   reply      if the good is undefined  undefinable   but you require of everyone that  they know innately what is right  you are back to subjectivism   no  and begging the question   see below      NUM  by defining the good solely in terms of evaluative terms      ditto here   an evaluative statement implies a value judgement on the  part of the person making it   again  incorrect  and questio

and from whence does this right stem  that it overrides the  rights  of the rest of us    and if you want to view that television station  you have to watch the commercials   you can t turn them off and still be viewing the television station   in other words  if you don t like what you see  don t look   there is no  right  i can think of that you have to force other people to conform to your idea of aesthetic behaviour   what s next  laws regulating how people must dress and look so as to appeal to your fashion sense  since you have this  right  of an aesthetic view       which has what to do with the topic of discussion     oh  i see   you don t want any legislation that might impinge on you  you just want everyone else on the planet to do what you want        insisting on perfect safety is for people who don t have the balls to live  in the real world        mary shafer  nasa ames dryden
i propose that pepsico  mcdonalds and other companies could put  into orbit banners that have ti

batse alone isn t always used to determine position   when a particularly bright burst occurs  there are a couple of other detectors that catch it going off   pioneer NUM or NUM is the one i m getting at here   this puppy is far enough away  that if a bright burst happens nearby  the huge annulus created by it will hopefully intersect the line or general circle given by batse  and we can get a moderately accurate position  say oh  NUM or NUM degrees  that is the closest anyone has ever gotten with it            actually  my advisor  another classmate of mine  and me were talking the other day about putting just one detector on one of the pluto satellites   then we realized that the satellite alone is only carrying something like NUM pounds of eq   well  a batse detector needs lead shielding to protect it  and NUM alone weighs about NUM pounds itself 
i ve had pretty good success autotracing line art with adobe streamline NUM NUM  the key to controlling excessive points  etc  is to take

so in conclusion it can be shown that there is essentially no     logical argument which clearly differentiates a  cult  from a      religion    i challenge anyone to produce a distinction which     is clear and can t be easily knocked down       how about this one  a religion is a cult which has stood the test   of time   just like history is written by the  winners  and not the  losers   from what i ve seen of religions  a religion is just a cult that was so vile and corrupt it was able to exert it s doctrine using political and military measures   perhaps if koresh withstood the onslaught for another couple of months he would have started  attracting more converts due to his  strength   hence becoming a full religion and not just a cult  
ok it is for a game that is NUMd and you have listed the characteristics  that you are looking for  i think you may have left out a few  important parameters   the polygons are all convex   they have less than n sides   you are drawing meshes walls

yesterday  i went to the boeing shareholders meeting   it was a bit shorter than i expected   last year  when the stock was first down   they made a big presentation on the NUM  and other programs   this year  it was much more bare bones   in any case  i wanted to ask a question that the board of directors would hear  and so i got there early  and figured that if i didn t get to the mike  maybe they would read mine off of a card  and so i wrote it down  and handed it in   after the meeting started  mr  shrontz said that he would only answer written questions  in order to be fair to the people in the overflow room that only had monitors downstairs   naturally  i was crushed   so  when question and answer time came  i was suprised to find my question being read and answered   admittedly near the end of the ones that he took  presumably getting there early  and getting the question in early made all the difference   so  on to the substance  the question was   is boeing looking at anything

 then we all live happily ever after       seriously  if we all agreed on the circumstances we re in  i suspect we d all agree on the best course of action   unfortunately  i have no confidence that such a situation will ever arise    some of us think there s a big god in the sky  some don t   some think they ve been chosen by god  others disagree   some think they are infallible  others think otherwise   until those disagreements over circumstances can be ironed out  there s little hope of everyone agreeing    yes   i think that  for example  only a vanishingly small number of people would hold that there s a frame of reference in which gassing six million jews is good   so that s probably about as close to an objective moral value as i ve encountered in my life so far    well  i think your example s poor   if the bomb s in iraq  for example  and was dropped by an american plane  many people would hold that it was a moral act    hmm   so these moral values have a perceptible physical 

the first and only thing i ve ever tried to auto trace was a piece of a uscg nautical chart using adobe illustrator NUM NUM   i wanted to get the outline of the coast for western long island sound   i was simultaneously suprised at how good a job it did and disappointed at how poorly it did   i suspect what i gave it was a very difficult thing  not only is the coastline very irregular  but overlaid on the chart are numerous sets of gridlines  not only lattitude and longitude  but loran grids as well    the most common mistake it make was whenever the coastline was roughly parallel and tangent to a grid line  it would take off following the gridline instead of the coast   i think the best improvement would be some sort of interactive algorithm that would let you step in and say  no  dummy  you re going the wrong way     steve reisberg  a friend of mine a few years back     did his doctoral work analysing electron micrographs of filimentous phage  virii   a good chunk of the work was wri

the point about its being real or not is that one does not waste time with what reality might be when one wants predictions  the questions if the atoms are there or if something else is there making measurements indicate atoms is not necessary in such a system    and one does not have to write a new theory of existence everytime new models are used in physics 
did i not hear that there maybe some ports of realNUMd versionNUM     in the pipeline somewhere  possibly unix  not too sure though         please put me straight 
did i not hear that there maybe some ports of realNUMd versionNUM     in the pipeline somewhere  possibly unix  not too sure though         please put me straight 
 NUM  well  mr koresh allowed other children and adults to leave the compound during the course of the siege  why didnt these children leave then  i dont know myself  and certainly havent heard any answers on this here   NUM  yes  one simple non action  ie not attacking the compound with modified tanks  woul

deleted     deleted      evolution is both fact and theory   the theory of evolution represents the scientific attempt to explain the fact of evolution   the theory of evolution does not provide facts  it explains facts   it can be safely assumed that all scientific theories neither provide nor become facts but rather explain facts  i recommend that you do some appropriate reading in general science   a good starting point with regard to evolution for the layman would be  evolution as fact and theory  in  hen s teeth and horse s toes   pp NUM NUM  by stephen jay gould   there is a great deal of other useful information in this publication  
hello  i have posted to this newsgroup once before and recieved a moderately helpful response on a couple of issues  this i appreciated very much   i would however like to know why it is that ther is simply no  information out there on some subjects for the relativly novice graphics programmer  the subjects are   NUM  how do you access the extra vid

in  NUMaprNUM NUM NUM midway uchicago edu  eebNUM quads uchicago edu   i already responded to this on one dimension  but afterthoughts cause me to make another  independent reply   the problem with a  moment of silence  is that it is not an even handed way of  allowing  for religion amongst students in the public schools   as i noted before  muslims need more than a moment of silence in order to perform the prayers they are required by muhammad to do   and  at least orthodox  jewish prayer also has requirements that are not addressed by this   there is  in fact  a highly selective bias towards christian prayer in this  moment of silence  shit   and that is especially bizarre in that christian prayer doesn t need this stuff    a christain may pray totally incognito at any time  to some extent  this is true of muslims and jews as well    what i intend in my first paragraph is that there  are  some characteristic forms of prayer in  these  religions which do need special times and or beha

this is a heavily edited modified version of the gopher faq intended to give people just starting with gopher enough information to get a client and jump into gopher space   a complete version can be obtained as described below   once you have a gopher client point it at  merlot welch jhu edu and welcome to gopher space    dan jacobson  danj welchgate welch jhu edu         common questions and answers about the internet gopher  a client server protocol for making a world wide information service  with many implementations   posted to comp infosystems gopher   comp answers  and news answers every two weeks   the most recent version of this faq can be gotten through gopher  or via anonymous ftp   rtfm mit edu  pub usenet news answers gopher faq  those without ftp access should send e mail to mail server rtfm mit edu with  send usenet news answers finding sources  in the body to find out how to do ftp by e mail                                                                        list of

once again  this posting has been delayed for about a week by falling between some software cracks       here follows an introduction to the controversial incident  and an  apologetic explanation purporting to show why it couldn t actually  have happened   the historicity of the episode doesn t matter to what  follows   i don t know whether i m quoting gregg or zakaria below   anyway  back to current affairs    among others  this incident is not something rushdie or watt or anyone else dug up from nowhere  it is a well known story  a myth if you will  known  according to umar khan  to  every muslim school boy and girl   and so presumably to rushdie  and to gibreel farishta    yes  this is what writing fiction is all about   rushdie was writing about a crisis of faith  and chose this myth to present it  by placing the actor  gibreel  in the role of the angel whose name he took  rushdie was not writing a history or theology book  and nowhere claims or implies that this is what actually h

i think you are mistaken in thinking tom scharle to be a atheist  you will find both atheists and christians among your opponents on t o  calling your opponents them  branch athiests zealots  does nothing for your credibility    oh yes  do     dear me  this is taken _so_ out of context that it s hard to know where to start    the quote starts with material from p NUM  and ends with material from page NUM   on page NUM  there s the bit that says  the parts left out in john king s  quote  are marked by             the doubt that has infiltrated the previous  smugly confident certitude of evolutionary biology  s last twenty years  has inflamed passions  and provoked some very interesting thought and research     eldridge goes on immediately following the butchered quote   in short  evolutionary biology has entered a phase of creativity that is the hallmark of good  active science    the material that is on page NUM that is  quoted  by john king has been butchered even more severely     i 

as ben says   this re boost idea is all news to us here   do you know  something we don t   please supply a source   it would be nice for  the schedulers of observations to know where the thing is going to  be   these altitude numbers are also way off     my best source has    minimum st altitude in the pmdb is     NUM kilometers   maximum st altitude in the pmdb is     NUM kilometers   delta   st altitude in the pmdb is       NUM kilometers     pmdb is proposal management data base   used to schedule observations         could you supply some calculations   you might check some recent  postings that explained that  a small booster  as suggested does  not now exist  so comparing the mass of something that doesn t  exist to the mass of the oms fuel seems impossible   the contamination  threat also remains           longer drag life i can understand  but could you explain the  antenna pointing      tell me about it   although the arrays can be  and are  moved perfectly  well utilizing th

in article  NUMaprNUM NUM is morgan com   jlieb is morgan com  jerry liebelson  writes        i want to know what weightlessness actually feels like  for example  is    there a constant sensation of falling    ron baalke  baalke kelvin jpl nasa gov  replied    yes  weightlessness does feel like falling   it may feel strange at first    but the body does adjust   the feeling is not too different from that   of sky diving   i m no astronaut  but i ve flown in the kc NUM several times   i ll tell you about my first flight   at the on set of weightlessness  my shoulders lifted and my spine straightened   i felt a momentary panic  and my hands tried to grab onto something  like the strap keeping me firmly against the floor  to prevent me from falling  i remember conciously over ruling my involuntary motions   my ears felt  not heard  a rush and i could feel fluid moving in my head  like when you get up from bed while you have a cold    at that point  i ceased to concentrate on my physiologi

br  from  wpr atlanta dg com  bill rawlins  br  newsgroups  alt atheism br  organization  dgsid  atlanta  ga  br         since you have referred to the messiah  i assume you br  are referring         to the new testament   please detail br  your complaints or e mail if         you don t want to post  br   first century greek is well known and         br  well understood   have you considered josephus  the jewish br  historian          who also wrote of jesus   in addition  br  the four gospel accounts   are very much in harmony   it is also well known that the comments in josephus relating to jesus were inserted  badly  by later editors   as for the four gospels being in harmony on the issue of jesus     you know not of what you speak   here are a few contradictions starting with the trial and continuing through the assension    acts NUM NUM   now this man  judas  purchased a field with the reward of  iniquity  and falling headlong  he burst asunder in the midst  and all his  bowels gu

fc  exactly what fraction of current research is done on the big   fc  visable light telescopes  from what i ve seen  NUM  or less  fc   down from amlost NUM  NUM years ago   that sounds like  dying  fc  to me      that doesn t seem like a fair comparison   infrared astronomy   didn t really get started until something like NUM yrs  ago  it  didn t explode until iras in NUM   gamma ray  and i think   x ray  observations didn t really get started until the  NUMs   i believe the same is true of ultraviolet observations in   general  and i know that extreme uv  short of NUM angstroms   observations  until the euve  launched last year  had almost   no history except a few observations on skylab in the  NUMs    twenty five years ago  the vast majority of astronomers only   had access to optical or radio instruments   now  with far more  instruments available  growth in some of these new fields has  resulted in optical work representing a smaller fraction of   total astronomical work     fc 

the following partial summary of a theory of the universe includes a little known description of the creation of our solar system                          larsonian astronomy and physics                 orthodox physicists  astronomers  and astrophysicists            claim to be looking for a  unified field theory  in which all            of the forces of the universe can be explained with a single            set of laws or equations   but they have been systematically            ignoring or suppressing an excellent one for NUM years                   the late physicist dewey b  larson s comprehensive            general unified theory of the physical universe  which he            calls the  reciprocal system   is built on two fundamental            postulates about the physical and mathematical natures of            space and time                        NUM   the physical universe is composed entirely of one            component  motion  existing in three dimensions  in discrete       

actually  for digital hdtv systems that s far higher bandwidth than you need  unless there s some reason you must work in fully uncompressed hdtv   also  my calculations is that each frame should be well under NUMmb  even using NUM bits pixel  which is more bits than you actually need   NUM or NUM should be enough for a moving picture    NUMxNUMxNUMbits is NUM NUMmb  i m guessing at hdtv resolution   it may be a bit wider than NUM  i m fairly sure of the NUM number for most of the digital proposals     i hope you have a very fast memory system as well   NUMmb s while displaying will require a heavily interleaved vram system    unless you have a _very_ compelling reason  i d advise trying to use at least somewhat compressed data   you don t have to go to full compression to get to a level where the data io requirements are much cheaper and easier to deal with       gnu emacs is a lisp operating system disguised as a word processor     doug mohney  in comp arch
well  almost  it turns out

first  i don t expect them to love me if they don t even know i   exist   secondly  i wouldn t expect them to love me simply because   they were my creator   i would expect to have to earn that love      are you daft   how do i love something i don t believe exists    come back when you ve learned to love your third testicle      at which point you have stepped over the line and become a   complete asshole   even though it s your first offense  i won t   let it slip becuase i ve heard it too goddamned many times     you love jesus because deep in your heart you re a cannibalistic   necrophiliac   because i say so  and i m much more qualified to   assess your motivations than you are     fortunately  there are some things i get to accept on evidence   rather than faith   one of them being that until christians like   yourself quit being so fucking arrogant  there will never be   peace   you ve all made sure of that                                                                         

in article  NUMmayNUM NUM NUM hal com   bobp hal com  bob pendleton  writes       from article  NUMmayNUM NUM NUM pony ingres com   by mwmeyer ingres com  mike  wading through the muck and  meyer           this is getting pretty silly  first off   hacker  is an obsolete term          doesn t matter what it used to mean  today it means  thief              it only means  thief  if you want it to mean that   to me  it means       lots of context wickedly omitted by myself               anyway  if i say  joe is a hacker  to most english speaking people who    know the word they ll probably think he is either a poor golfer or a    bad carpenter  but there are very very few people who will think he is    a good and clever programmer              if you chose to call yourself by a term that means  thief  don t be    surprised when people think you are a thief  even if you don t agree    with that definition of the word                                     the narrower view that a hacker  when 

from article  NUMsuntv NUMkm watson mtsu edu   by csjohn watson mtsu edu  john wallace    the gl file is an archive containing individual frames or pieces of frames  usually stored as  pic or  clp files   fonts  and a  txt file that tells the grasp animation system how to display it   gl stands for grasp library   there is probably a detailed discussion of this subject in the alt binaries pictures faq   there are freely distributable viewers for gl files  and they are usually named grasprt  exe  replace the   with a version digit or letter    most gl files contain frames that are hardware specific to particular modes of the cga  ega  or vga adapters on pcs   i think that there are some copies of grasprt available by anonymous ftp  i know that i got one there a long time ago      good luck   jack 
there are a number of philosophical questions that i would like to ask   NUM   if we encounter a life form during our space exploration  how do we determine if we should capture it  imprison i

stuff deleted  i reiterate that i would agree with you that there is little justification for the punishment of apostasy in the qur an  in islamic history  as well  apostasy has rarely been punished   belief is considered a matter of conscience and since there is to be no compulsion in the matter of belief  apostates have been generally left to believe or not believe as they will   however  when an apostate makes attacks upon  god and his messenger  the situation changes  now the charge of apostasy may be complicated with other charges   perhaps  charges of sedition  treason  spying  etc  if the person  makes a public issue of their apostasy or mounts public attacks  as opposed to arguement  against islam  the situation is likewise complicated  if the person spreads slander or broadcasts falsehoods  again the situation changes  the punishments vary according to the situation the apostate is in  anyhow  the charge of aggravated apostasy would only be a subsidiary charge in rushdie s cas

 in article  NUMrNUMsnNUM NUMr horus ap mchp sni de            theism is strongly correlated with irrational belief in absolutes  irrational        belief in absolutes is strongly correlated with fanatism       deletion        theism is correlated with fanaticism  i have neither said that all fanatism    is caused by theism nor that all theism leads to fanatism  the point is     theism increases the chance of becoming a fanatic  one could of course    argue that would be fanatics tend towards theism  for example   but i just    have to loook at the times in history when theism was the dominant ideology    to invalidate that conclusion that that is the basic mechanism behind it       imo  the influence of stalin  or for that matter  ayn rand  invalidates your   assumption that theism is the factor to be considered      bogus  i just said that theism is not the only factor for fanatism   the point is that theism is  a  factor   that s your claim  now back it up   i consider your argument

technion   israel institute of technology          department of computer science         graduate studies in computer graphics  applications are invited for graduate students wishing to specialize in computer graphics and related fields  active research is being conducted in the fields of image rendering  geometric modelling and computer animation  state of the art graphics workstations  sun  silicon graphics  and video equipment are available  the technion offers full scholarship support  tuition and  assistantships  for suitable candidates   for more information contact
hi    sorry if this is a faq  but are there any conversion utilities available for autodesk   dxf to amiga   iff format   i checked the comp graphics faq and a number of sites  but so far no banana   please e mail   thanks 
hi    sorry if this is a faq  but are there any conversion utilities available for autodesk   dxf to amiga   iff format   i checked the comp graphics faq and a number of sites  but so far no banan

  sample_weight=sample_weight)


In [357]:
test_case = train_data[9:11]
test_case = np.array(test_case)
print test_case

[ u"I'm interested in find out what is involved in processing pairs of \nstereo photographs.  I have black-and-white photos and would like \nto obtain surface contours.\n\nI'd prefer to do the processing on an SGI, but would be interested\nin hearing what software/hardware is used for this type of\nimage processing.\n\nPlease email and/or post to comp.sys.sgi.graphics your responses.\n\nThanks,"
 u"a\n\nWhat about positional uncertainties in S-L 1993e?   I assume we know where\nand what Galileo is doing within a few meters.   But without the\nHGA,  don't we have to have some pretty good ideas, of where to look\nbefore imaging?  If the HGA was working,  they could slew around\nin near real time (Less speed of light delay).  But when they were\nimaging toutatis????  didn't someone have to get lucky on a guess to\nfind the first images?   \n\nAlso, I imagine S-L 1993e will be mostly a visual image.  so how will\nthat affect the other imaging missions.  with the LGA,  there is a real\ntigh

In [361]:
#removes paragraphs
print type(test_case)
cleaner_entry = [(re.sub(r'[^\w]', ' ', item)) for item in test_case]
print cleaner_entry

<type 'numpy.ndarray'>
[u'I m interested in find out what is involved in processing pairs of  stereo photographs   I have black and white photos and would like  to obtain surface contours   I d prefer to do the processing on an SGI  but would be interested in hearing what software hardware is used for this type of image processing   Please email and or post to comp sys sgi graphics your responses   Thanks ', u'a  What about positional uncertainties in S L 1993e    I assume we know where and what Galileo is doing within a few meters    But without the HGA   don t we have to have some pretty good ideas  of where to look before imaging   If the HGA was working   they could slew around in near real time  Less speed of light delay    But when they were imaging toutatis      didn t someone have to get lucky on a guess to find the first images      Also  I imagine S L 1993e will be mostly a visual image   so how will that affect the other imaging missions   with the LGA   there is a real tigh

In [364]:
#removes paragraphs
print type(test_case)
[(re.sub(r'[^\w]', ' ', item)) for item in test_case]


<type 'numpy.ndarray'>


[u'I m interested in find out what is involved in processing pairs of  stereo photographs   I have black and white photos and would like  to obtain surface contours   I d prefer to do the processing on an SGI  but would be interested in hearing what software hardware is used for this type of image processing   Please email and or post to comp sys sgi graphics your responses   Thanks ',
 u'a  What about positional uncertainties in S L 1993e    I assume we know where and what Galileo is doing within a few meters    But without the HGA   don t we have to have some pretty good ideas  of where to look before imaging   If the HGA was working   they could slew around in near real time  Less speed of light delay    But when they were imaging toutatis      didn t someone have to get lucky on a guess to find the first images      Also  I imagine S L 1993e will be mostly a visual image   so how will that affect the other imaging missions   with the LGA   there is a real tight allocation of bandwi

In [365]:
test_case = test_case.lower()
print test_case

AttributeError: 'numpy.ndarray' object has no attribute 'lower'

In [363]:
#removes paragraphs
#print type(test_case)
for item in range(0,len(test_case)):
    test_case[item] = [(re.sub(r'[^\w]', ' ', test_case[item]))]
print cleaner_entry

ValueError: setting an array element with a sequence

(6) The idea of regularization is to avoid learning very large weights (which are likely to fit the training data, but not generalize well) by adding a penalty to the total size of the learned weights. That is, logistic regression seeks the set of weights that minimizes errors in the training data AND has a small size. The default regularization, L2, computes this size as the sum of the squared weights (see P3, above). L1 regularization computes this size as the sum of the absolute values of the weights. The result is that whereas L2 regularization makes all the weights relatively small, L1 regularization drives lots of the weights to 0, effectively removing unimportant features.

Train a logistic regression model using a "l1" penalty. Output the number of learned weights that are not equal to zero. How does this compare to the number of non-zero weights you get with "l2"? Now, reduce the size of the vocabulary by keeping only those features that have at least one non-zero weight and retrain a model using "l2".

Make a plot showing accuracy of the re-trained model vs. the vocabulary size you get when pruning unused features by adjusting the C parameter.

Note: The gradient descent code that trains the logistic regression model sometimes has trouble converging with extreme settings of the C parameter. Relax the convergence criteria by setting tol=.01 (the default is .0001).

[4 pts]

In [8]:
def P6():
    # Keep this random seed here to make comparison easier.
    np.random.seed(0)

    ### STUDENT START ###

    

    ### STUDENT END ###
P6()

(7) Use the TfidfVectorizer -- how is this different from the CountVectorizer? Train a logistic regression model with C=100.

Make predictions on the dev data and show the top 3 documents where the ratio R is largest, where R is:

maximum predicted probability / predicted probability of the correct label

What kinds of mistakes is the model making? Suggest a way to address one particular issue that you see.

[4 pts]

In [11]:
#def P7():
    ### STUDENT START ###



    ### STUDENT END ###
#P7()

ANSWER: TfidfVectorizer -- how is this different from the CountVectorizer? The term frequency inverse document frequency weighs the importance of the features given the length of the overall message so that longer messages don't carry additional importance in the model just because they are longer.

(8) EXTRA CREDIT

Try implementing one of your ideas based on your error analysis. Use logistic regression as your underlying model.

- [1 pt] for a reasonable attempt
- [2 pts] for improved performance