<a href="https://colab.research.google.com/github/victoriapedlar/isizulu-text-generation/blob/main/Copy_of_AWD_LSTM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Language Model Using Techniques from AWD-LSTM
This notebook contains an implementation of a language model that uses techniques from the state-of-the-art AWD-LSTM.

This model was built as part of a project in the course 02456 Deep Learning @ DTU - Technical University of Denmark
+ This code was originally forked from the [PyTorch word level language modeling example](https://github.com/pytorch/examples/tree/master/word_language_model) and is heavily inspired by the original AWD-LSTM implementation [LSTM and QRNN Language Model Toolkit](https://github.com/salesforce/awd-lstm-lm)

+ The code in this notebook is available on [Google Colab](https://colab.research.google.com/drive/1yyUGJfyYKdvPi6J7ZlsxPg9E_ppZG1xU) and on [GitHub](https://github.com/mikkelbrusen/awd-inspired-lstm).

The model comes with instructions to train a word-level language model on the Penn Treebank (PTB).

The project was carried out by [Gustav Madslund](https://github.com/gustavmadslund) and [Mikkel Møller Brusen](https://github.com/mikkelbrusen).




Below is a checklist of the components from the original AWD-LSTM that we have implemented in this model:
### Core components

1.   **[x]  - Multi Layer** - We need to control what happens between the layers, so instead of using the multi-layer cuDNN LSTM implementation, we create multiple single-layer cuDNN LSTMs.
2.   **[x] - Weight drop** using DropConnect on hidden-hidden weights $[U^i, U^f, U^o, U^c]$ before forward and backward pass - makes it possible to use cuDNN LSTM
3.   **[x] - Optimization** using SGD and ASGD while training

### Extended regularization techniques
4.   **[ ] - Variable sequence length** to allow all elements in the dataset to experience a full BPTT window
  - **[ ] - Rescale learning rate** to counter variable sequence lengths favoring short sequences under a fixed learning rate
5.   **[x] - Variational dropout AKA LockDrop** for everything other than hidden-hidden, such that we use the same dropout mask for all inputs/outputs in a forward-backward pass of the LSTM
6.   **[x] - Embedding dropout** which is **not** just a dropout applied on the embedding output: it drops entire words (rows of the embedding matrix)
7.   **[x]  - Weight tying** to reduce parameters and prevent the model from having to learn a one-to-one correspondence between input and output
8.   **[x] - Embed size** independent from hidden size, to reduce parameters.
9.   **[ ] - AR and TAR** - $L_2$-regularization by applying AR and TAR loss on the final RNN layer - can degrade results
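
The unchecked items 4 and its sub-item (variable sequence length with learning-rate rescaling) could be sketched roughly as follows. This is only an illustration, not part of this notebook's training loop: the function names are hypothetical, and the constants (a 5% chance of halving the window, a standard deviation of 5, a minimum length of 5) are assumptions modelled on the defaults of the reference AWD-LSTM implementation.

```python
import random

def sample_seq_len(base_bptt, rng, short_prob=0.05, std=5.0, min_len=5):
    """Draw a BPTT window length around base_bptt (occasionally around half of it)."""
    mean = base_bptt if rng.random() >= short_prob else base_bptt / 2.0
    return max(min_len, int(rng.gauss(mean, std)))

def rescaled_lr(base_lr, seq_len, base_bptt):
    """Scale the learning rate linearly with the sampled window length."""
    return base_lr * seq_len / base_bptt

rng = random.Random(141)  # same seed value as args_seed, chosen for illustration only
for _ in range(3):
    seq_len = sample_seq_len(70, rng)
    print(seq_len, round(rescaled_lr(30.0, seq_len, 70), 2))
```

A shorter window gets a proportionally smaller step, so short sequences do not receive the same update weight as a full window of `args_bptt` tokens.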


# Setup
This section contains all the necessary setup: hyperparameters, data processing, and utility functions.

## Google Colab Setup
Since we are running on Google Colab, we will need to install PyTorch as they only support TensorFlow by default, because, well, they are Google and not Facebook.

### PyTorch 0.4.1 with CUDA 9.2 backend

In [None]:
 # http://pytorch.org/
from os.path import exists
# !pip install wheel==0.34.2
from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())
cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\.\([0-9]*\)\.\([0-9]*\)$/cu\1\2/'
accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'

# !pip uninstall torch -y
!pip install -q http://download.pytorch.org/whl/{accelerator}/torch-0.4.1-{platform}-linux_x86_64.whl torchvision
# import torch

ERROR: HTTP error 403 while getting http://download.pytorch.org/whl/cu110/torch-0.4.1-cp37-cp37m-linux_x86_64.whl
ERROR: Could not install requirement torch==0.4.1 from http://download.pytorch.org/whl/cu110/torch-0.4.1-cp37-cp37m-linux_x86_64.whl because of HTTP error 403 Client Error: Forbidden for url: http://download.pytorch.org/whl/cu110/torch-0.4.1-cp37-cp37m-linux_x86_64.whl for URL http://download.pytorch.org/whl/cu110/torch-0.4.1-cp37-cp37m-linux_x86_64.whl


We will need some data to train on, and a place to save our model.
We connect to Google Drive and place our data in the following path: *My Drive/NLP/data/penn/*

In [None]:
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

## Imports and params





In [None]:
import argparse
import time
import math
import os
import torch
import torch.nn as nn
import torch.onnx
import numpy as np
from torch.autograd import Variable
from collections import Counter

In [None]:
args_cuda = torch.cuda.is_available()
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hyper-parameters
args_train_batch_size = 20 # batch size
args_bptt = 70 # sequence length
args_embed_size = 400 # emsize
args_hidden_size = 1150 # nhid
args_num_layers = 3 # nlayers
args_num_epochs = 750
args_learning_rate = 30
args_dropout = 0.4
args_dropouth = 0.25
args_dropouti = 0.4
args_dropoute = 0.1
args_clip = 0.25
args_log_interval = 100

# If you don't already have the Penn Treebank data, grab it from our GitHub repo
# here: https://github.com/mikkelbrusen/awd-inspired-lstm
args_data = "/content/gdrive/My Drive/NLP/data/penn/"

# The file in which we want to save our trained model.
args_save = "/content/gdrive/My Drive/NLP/save/AWD_LSTM_Model.pt"

args_seed = 141
args_nonmono = 5
args_wdrop = 0.5
args_tie_weights = True

np.random.seed(args_seed)
torch.manual_seed(args_seed)

if args_cuda:
  torch.cuda.manual_seed(args_seed)

## The data loader
Dictionary and corpus to process the dataset

In [None]:
class Dictionary(object):
    def __init__(self):
        self.word2idx = {}
        self.idx2word = []
        self.counter = Counter()
        self.total = 0

    def add_word(self, word):
        if word not in self.word2idx:
            self.idx2word.append(word)
            self.word2idx[word] = len(self.idx2word) - 1
        token_id = self.word2idx[word]
        self.counter[token_id] += 1
        self.total += 1
        return self.word2idx[word]

    def __len__(self):
        return len(self.idx2word)


class Corpus(object):
    def __init__(self, path):
        self.dictionary = Dictionary()
        self.train = self.tokenize(os.path.join(path, 'train.txt'))
        self.valid = self.tokenize(os.path.join(path, 'valid.txt'))
        self.test = self.tokenize(os.path.join(path, 'test.txt'))

    def tokenize(self, path):
        """Tokenizes a text file."""
        assert os.path.exists(path)
        # Add words to the dictionary
        with open(path, 'r') as f:
            tokens = 0
            for line in f:
                words = line.split() + ['<eos>']
                tokens += len(words)
                for word in words:
                    self.dictionary.add_word(word)

        # Tokenize file content
        with open(path, 'r') as f:
            ids = torch.LongTensor(tokens)
            token = 0
            for line in f:
                words = line.split() + ['<eos>']
                for word in words:
                    ids[token] = self.dictionary.word2idx[word]
                    token += 1

        return ids
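
To make the word-to-id mapping concrete, here is a small in-memory sketch that mirrors `Dictionary.add_word` and the `<eos>` handling in `Corpus.tokenize`. The names `TinyDictionary` and `tokenize_lines` are hypothetical stand-ins, and the sketch works on a list of strings rather than on files and returns a plain list rather than a `torch.LongTensor`.

```python
from collections import Counter

class TinyDictionary:
    """Minimal stand-in for Dictionary: assigns ids in order of first appearance."""
    def __init__(self):
        self.word2idx = {}
        self.idx2word = []
        self.counter = Counter()

    def add_word(self, word):
        if word not in self.word2idx:
            self.idx2word.append(word)
            self.word2idx[word] = len(self.idx2word) - 1
        self.counter[self.word2idx[word]] += 1
        return self.word2idx[word]

def tokenize_lines(lines, dictionary):
    """Map every word (plus an <eos> marker per line) to its integer id."""
    ids = []
    for line in lines:
        for word in line.split() + ['<eos>']:
            ids.append(dictionary.add_word(word))
    return ids

vocab = TinyDictionary()
ids = tokenize_lines(["the cat sat", "the dog sat"], vocab)
print(ids)             # [0, 1, 2, 3, 0, 4, 2, 3]
print(vocab.idx2word)  # ['the', 'cat', 'sat', '<eos>', 'dog']
```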

## Utils
Utility functions which will be used while training, validating and testing

In [None]:
def repackage_hidden(h):
    """Wraps hidden states in new Tensors, to detach them from their history."""
    if isinstance(h, torch.Tensor):
        return h.detach()
    else:
        return tuple(repackage_hidden(v) for v in h)


# Starting from sequential data, batchify arranges the dataset into columns.
# For instance, with the alphabet as the sequence and batch size 4, we'd get
# ┌ a g m s ┐
# │ b h n t │
# │ c i o u │
# │ d j p v │
# │ e k q w │
# └ f l r x ┘.
# These columns are treated as independent by the model, which means that the
# dependence of e. g. 'g' on 'f' can not be learned, but allows more efficient
# batch processing.

def batchify(data, bsz):
    # Work out how cleanly we can divide the dataset into bsz parts.
    nbatch = data.size(0) // bsz
    # Trim off any extra elements that wouldn't cleanly fit (remainders).
    data = data.narrow(0, 0, nbatch * bsz)
    # Evenly divide the data across the bsz batches.
    data = data.view(bsz, -1).t().contiguous()
    return data.to(device)


# get_batch subdivides the source data into chunks of length args.bptt.
# If source is equal to the example output of the batchify function, with
# a bptt-limit of 2, we'd get the following two Variables for i = 0:
# ┌ a g m s ┐ ┌ b h n t ┐
# └ b h n t ┘ └ c i o u ┘
# Note that despite the name of the function, the subdivison of data is not
# done along the batch dimension (i.e. dimension 1), since that was handled
# by the batchify function. The chunks are along dimension 0, corresponding
# to the seq_len dimension in the LSTM.
  
def get_batch(source, i):
    seq_len = min(args_bptt, len(source) - 1 - i)
    data = source[i:i+seq_len]
    target = source[i+1:i+1+seq_len].view(-1)
    return data, target
  
def model_save(fn):
    with open(fn, 'wb') as f:
        torch.save([model, criterion, optimizer], f)
        
def model_load(fn):
    global model, criterion, optimizer
    with open(fn, 'rb') as f:
        model, criterion, optimizer = torch.load(f)
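
The column layout described in the comments for `batchify` and `get_batch` can be checked on the alphabet itself. The sketch below is a pure-Python mirror of the tensor operations, using lists of letters instead of tensors so the result can be compared directly with the diagrams; `batchify_rows` and `get_batch_rows` are hypothetical names and are not used elsewhere in the notebook.

```python
import string

def batchify_rows(seq, bsz):
    """Split seq into bsz contiguous columns and return the rows of that layout."""
    nbatch = len(seq) // bsz
    seq = seq[:nbatch * bsz]  # trim the remainder, like data.narrow(...)
    columns = [seq[k * nbatch:(k + 1) * nbatch] for k in range(bsz)]
    return [list(row) for row in zip(*columns)]  # transpose, like .t()

def get_batch_rows(rows, i, bptt):
    """Return bptt rows starting at i, and the same rows shifted by one, flattened."""
    seq_len = min(bptt, len(rows) - 1 - i)
    data = rows[i:i + seq_len]
    target = [tok for row in rows[i + 1:i + 1 + seq_len] for tok in row]
    return data, target

rows = batchify_rows(list(string.ascii_lowercase), 4)
data, target = get_batch_rows(rows, 0, 2)
print(rows[0])   # ['a', 'g', 'm', 's']
print(data)      # [['a', 'g', 'm', 's'], ['b', 'h', 'n', 't']]
print(target)    # ['b', 'h', 'n', 't', 'c', 'i', 'o', 'u']
```

The letters `y` and `z` are dropped because 26 is not divisible by 4, and the target is flattened in the same row-major order that `.view(-1)` produces.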

## Process data
Load the dataset and make train, validation and test sets

In [None]:
# Load "Penn Treebank" dataset
corpus = Corpus(args_data)

eval_batch_size = 10
test_batch_size = 10
train_data = batchify(corpus.train, args_train_batch_size)
val_data = batchify(corpus.valid, eval_batch_size)
test_data = batchify(corpus.test, test_batch_size)

# AWD-LSTM
ASGD Weight-Dropped LSTM -> AWD-LSTM

https://github.com/salesforce/awd-lstm-lm

The paper can be found here:

https://arxiv.org/abs/1708.02182

The functions for WeightDrop, LockDrop, EmbeddingDropout have been taken from their source code.

An explanation of these functions can be found here:

https://towardsdatascience.com/learning-note-dropout-in-recurrent-networks-part-2-f209222481f8





## Embedding Dropout

In [None]:
def embedded_dropout(embed, words, dropout=0.1):
  if dropout:
    mask = embed.weight.data.new().resize_((embed.weight.size(0), 1)).bernoulli_(1 - dropout).expand_as(embed.weight) / (1 - dropout)
    masked_embed_weight = mask * embed.weight
  else:
    masked_embed_weight = embed.weight

  padding_idx = embed.padding_idx
  if padding_idx is None:
      padding_idx = -1

  X = torch.nn.functional.embedding(words, masked_embed_weight,
    padding_idx, embed.max_norm, embed.norm_type,
    embed.scale_grad_by_freq, embed.sparse
  )
  return X
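
The key point of `embedded_dropout` is that the mask has one entry per vocabulary row, so a dropped word is zeroed everywhere it occurs in the batch. The following sketch illustrates this on a tiny embedding; the sizes are arbitrary, and it builds the mask with `torch.bernoulli` rather than the in-place `new().resize_().bernoulli_()` chain used above, which is an assumption made for readability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
embed = nn.Embedding(6, 4)  # hypothetical vocabulary of 6 words, 4 dimensions
p = 0.5

# One Bernoulli draw per vocabulary row, rescaled so the expected value is unchanged.
row_mask = torch.bernoulli(torch.full((embed.num_embeddings, 1), 1 - p)) / (1 - p)
masked_weight = row_mask * embed.weight  # broadcasts across the embedding dimension

words = torch.tensor([[0, 1, 2], [0, 4, 5]])  # word 0 appears twice
vectors = F.embedding(words, masked_weight)

print(row_mask.squeeze(1))  # per-word keep mask (0 or 1 / (1 - p))
print(vectors.shape)        # torch.Size([2, 3, 4])
```

Both occurrences of word 0 receive the same (kept or dropped) vector, which is what distinguishes this from applying ordinary dropout to the embedding output.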

## DropConnect

In [None]:
from functools import wraps

class WeightDrop(torch.nn.Module):
    def __init__(self, module, weights, dropout=0):
        super(WeightDrop, self).__init__()
        self.module = module
        self.weights = weights
        self.dropout = dropout
        self._setup()

    def widget_demagnetizer_y2k_edition(*args, **kwargs):
        return

    def _setup(self):
        if issubclass(type(self.module), torch.nn.RNNBase):
            self.module.flatten_parameters = self.widget_demagnetizer_y2k_edition

        for name_w in self.weights:
            print('Applying weight drop of {} to {}'.format(self.dropout, name_w))
            w = getattr(self.module, name_w)
            del self.module._parameters[name_w]
            self.module.register_parameter(name_w + '_raw', nn.Parameter(w.data))

    def _setweights(self):
        for name_w in self.weights:
            raw_w = getattr(self.module, name_w + '_raw')
            w = None
            w = torch.nn.functional.dropout(raw_w, p=self.dropout, training=self.training)
            setattr(self.module, name_w, w)

    def forward(self, *args):
        self._setweights()
        return self.module.forward(*args)
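
What `_setweights` does to a single weight matrix can be seen in isolation. This is a minimal sketch on a random 4x4 matrix standing in for `weight_hh_l0_raw`; the shape and dropout rate are arbitrary choices for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
raw_w = torch.randn(4, 4)  # stand-in for the raw hidden-to-hidden weights
p = 0.5

# Training mode: individual weights are zeroed and survivors are scaled by 1 / (1 - p).
dropped_w = F.dropout(raw_w, p=p, training=True)
kept = dropped_w != 0

# Evaluation mode: the raw weights pass through unchanged.
eval_w = F.dropout(raw_w, p=p, training=False)

print(kept)
print(torch.equal(eval_w, raw_w))
```

Because the dropout is applied to the weights once per forward call rather than to the activations at every time step, the same connections stay dropped for the whole sequence.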

## Locked Dropout

In [None]:
class LockedDropout(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x, dropout=0.5):
        if not self.training or not dropout:
            return x
        m = x.data.new(1, x.size(1), x.size(2)).bernoulli_(1 - dropout)
        mask = Variable(m, requires_grad=False) / (1 - dropout)
        mask = mask.expand_as(x)
        return mask * x
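
The difference from ordinary dropout is the shape of the mask: it has size 1 along the time dimension and is broadcast over all time steps. A small sketch on an all-ones input (sizes chosen arbitrarily) makes the shared mask visible:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
seq_len, batch, features = 5, 2, 3
x = torch.ones(seq_len, batch, features)
p = 0.5

# One mask per (batch, feature) pair, reused at every time step.
mask = torch.bernoulli(torch.full((1, batch, features), 1 - p)) / (1 - p)
locked = mask * x

# Ordinary dropout draws an independent mask entry for every element instead.
standard = F.dropout(x, p=p, training=True)

print(locked[0])
print(locked[-1])  # identical to locked[0]
```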

# Create the Model
First we define our model

In [None]:
class LSTMModel(nn.Module):
    def __init__(self, num_tokens, hidden_size, embed_size, output_size, dropout=0.5, n_layers=1, wdrop=0, dropouth=0.5, dropouti=0.5, dropoute=0.1, tie_weights=False):
        super(LSTMModel, self).__init__()
        self.embed_size = embed_size
        self.hidden_size = hidden_size
        self.n_layers = n_layers
        
        self.tie_weights = tie_weights
        self.lockdrop = LockedDropout()
        self.dropouti = dropouti
        self.dropouth = dropouth
        self.dropoute = dropoute
        self.dropout = dropout
        self.encoder = nn.Embedding(num_tokens, embed_size)
        
        #init LSTM layers
        self.lstms = []
        
        for l in range(n_layers):
          layer_input_size = embed_size if l == 0 else hidden_size
          layer_output_size = hidden_size if l != n_layers-1 else (embed_size if tie_weights else hidden_size)
          self.lstms.append(nn.LSTM(layer_input_size, layer_output_size, num_layers=1, dropout=0))
        if wdrop:
          # Encapsulate lstms in DropConnect class to tap in on their forward() function and drop connections
          self.lstms = [WeightDrop(lstm, ['weight_hh_l0'], dropout=wdrop) for lstm in self.lstms]
        self.lstms = nn.ModuleList(self.lstms)
        
        self.decoder = nn.Linear(embed_size if tie_weights else hidden_size, output_size)
        
        if tie_weights:
          #Tie weights
          self.decoder.weight = self.encoder.weight
          
        self.init_weights()
       

    def init_weights(self):
        initrange = 0.1
        self.encoder.weight.data.uniform_(-initrange, initrange)
        self.decoder.bias.data.zero_()
        self.decoder.weight.data.uniform_(-initrange, initrange)

    def forward(self, inp, hidden):
        # Do embedding dropout
        emb = embedded_dropout(self.encoder, inp, dropout=self.dropoute if self.training else 0)
        # Do variational dropout
        emb = self.lockdrop(emb, self.dropouti)
        
        new_hidden = []
        outputs = []
        output = emb
        for i, lstm in enumerate(self.lstms): 
            output, new_hid = lstm(output, hidden[i])
            
            new_hidden.append(new_hid)
            if i != self.n_layers - 1:
              # Do variational dropout
              output = self.lockdrop(output, self.dropouth)
        
        hidden = new_hidden
        # Do variational dropout
        output = self.lockdrop(output, self.dropout)
   
        decoded = self.decoder(output.view(output.size(0)*output.size(1), output.size(2)))
        decoded = decoded.view(output.size(0), output.size(1), decoded.size(1))
        return decoded, hidden

    def init_hidden(self,bsz):
        weight = next(self.parameters()).data

        return [(weight.new(1, bsz, self.hidden_size if l != self.n_layers - 1 else (self.embed_size if self.tie_weights else self.hidden_size)).zero_(),
                weight.new(1, bsz, self.hidden_size if l != self.n_layers - 1 else (self.embed_size if self.tie_weights else self.hidden_size)).zero_())
                for l in range(self.n_layers)]


Then we build the model and specify our loss function

In [None]:
ntokens = len(corpus.dictionary)

model = LSTMModel(ntokens, args_hidden_size, args_embed_size, ntokens, args_dropout, args_num_layers, args_wdrop, args_dropouth, args_dropouti, args_dropoute, args_tie_weights).to(device)
criterion = nn.CrossEntropyLoss()

# Print number of parameters for comparison with other language models
params = list(model.parameters()) + list(criterion.parameters())
total_params = sum(x.size()[0] * x.size()[1] if len(x.size()) > 1 else x.size()[0] for x in params if x.size())
print('Model total parameters:', total_params)
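
To get a feel for how much weight tying contributes to the parameter count, the arithmetic for the embedding and decoder alone can be written out. This is a back-of-the-envelope sketch: the helper name is hypothetical, the vocabulary size of 10,000 is an assumed value for illustration, and the LSTM layers are ignored.

```python
def embedding_and_decoder_params(vocab_size, embed_size, hidden_size, tie_weights):
    """Count encoder + decoder parameters, ignoring the LSTM layers."""
    encoder = vocab_size * embed_size
    decoder_bias = vocab_size
    if tie_weights:
        # The decoder reuses the encoder matrix, so only its bias is extra.
        return encoder + decoder_bias
    return encoder + vocab_size * hidden_size + decoder_bias

vocab_size = 10_000  # assumed for illustration
tied = embedding_and_decoder_params(vocab_size, 400, 1150, tie_weights=True)
untied = embedding_and_decoder_params(vocab_size, 400, 1150, tie_weights=False)
print(tied)    # 4010000
print(untied)  # 15510000
```

This is also why the last LSTM layer outputs `embed_size` features when `tie_weights` is set: the decoder weight must have the same shape as the encoder weight.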

# Train the model

First we define our training and evaluation functions

In [None]:
def evaluate(data_source):
    # Turn on evaluation mode which disables dropout.
    model.eval()
    total_loss = 0.
    ntokens = len(corpus.dictionary)
    # Size the hidden state from the data itself so any batch size works.
    hidden = model.init_hidden(data_source.size(1))
    with torch.no_grad():
        for i in range(0, data_source.size(0) - 1, args_bptt):
            data, targets = get_batch(data_source, i)
            output, hidden = model(data, hidden)
            output_flat = output.view(-1, ntokens)
            total_loss += len(data) * criterion(output_flat, targets).item()
            hidden = repackage_hidden(hidden)
    return total_loss / (len(data_source) - 1)


def train():
    # Turn on training mode which enables dropout.
    model.train()
    total_loss = 0.
    start_time = time.time()
    ntokens = len(corpus.dictionary)
    hidden = model.init_hidden(args_train_batch_size)
    for batch, i in enumerate(range(0, train_data.size(0) - 1, args_bptt)):
        data, targets = get_batch(train_data, i)
        # Starting each batch, we detach the hidden state from how it was previously produced.
        # If we didn't, the model would try backpropagating all the way to start of the dataset.
        hidden = repackage_hidden(hidden)
        optimizer.zero_grad()
        output, hidden = model(data, hidden)

        loss = criterion(output.view(-1, ntokens), targets)
        loss.backward()

        # `clip_grad_norm_` helps prevent the exploding gradient problem in RNNs / LSTMs.
        torch.nn.utils.clip_grad_norm_(model.parameters(), args_clip)
        optimizer.step()

        total_loss += loss.item()

        if batch % args_log_interval == 0 and batch > 0:
            cur_loss = total_loss / args_log_interval
            elapsed = time.time() - start_time
            print('| epoch {:3d} | {:5d}/{:5d} batches | lr {:02.2f} | ms/batch {:5.2f} | '
                    'loss {:5.2f} | ppl {:8.2f}'.format(
                epoch, batch, len(train_data) // args_bptt, lr,
                elapsed * 1000 / args_log_interval, cur_loss, math.exp(cur_loss)))
            total_loss = 0
            start_time = time.time()
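
The AR/TAR item left unchecked in the component list would add two penalties to the loss computed in `train()`. The sketch below shows the penalty terms only, as an illustration under assumptions: `ar_tar_penalty` is a hypothetical helper, the coefficients `alpha=2.0` and `beta=1.0` follow the defaults of the reference AWD-LSTM implementation, and the model in this notebook would first have to return the final layer's raw and dropped-out outputs, which it currently does not.

```python
import torch
import torch.nn.functional as F

def ar_tar_penalty(dropped_output, raw_output, alpha=2.0, beta=1.0):
    """Activation regularization (AR) plus temporal activation regularization (TAR)."""
    # AR: penalize large activations of the final layer (after dropout).
    ar = alpha * dropped_output.pow(2).mean()
    # TAR: penalize large changes between consecutive time steps (before dropout).
    tar = beta * (raw_output[1:] - raw_output[:-1]).pow(2).mean()
    return ar + tar

torch.manual_seed(0)
raw = torch.randn(7, 2, 4)  # (seq_len, batch, features), arbitrary sizes
dropped = F.dropout(raw, p=0.4, training=True)
penalty = ar_tar_penalty(dropped, raw)
print(penalty.item())
```

In a training step the penalty would simply be added to the cross-entropy loss before calling `backward()`.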


Then do the actual training

In [None]:
# Loop over epochs.
lr = args_learning_rate
best_val_loss = 100000000
stored_losses = []

# At any point you can hit Ctrl + C to break out of training early.
try:
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for epoch in range(1, args_num_epochs+1):
        epoch_start_time = time.time()
        train() 
        if 't0' in optimizer.param_groups[0]:
            tmp = {}
            for prm in model.parameters():
                tmp[prm] = prm.data.clone()
                prm.data = optimizer.state[prm]['ax'].clone()

            val_loss2 = evaluate(val_data)
            print('-' * 89)
            print('| end of epoch {:3d} | time: {:5.2f}s | valid loss {:5.2f} | '
                'valid ppl {:8.2f} | valid bpc {:8.3f}'.format(
                    epoch, (time.time() - epoch_start_time), val_loss2, math.exp(val_loss2), val_loss2 / math.log(2)))
            print('-' * 89)

            if val_loss2 < best_val_loss:
                model_save(args_save)
                best_val_loss = val_loss2

            for prm in model.parameters():
                prm.data = tmp[prm].clone()

        else:
            val_loss = evaluate(val_data)
            print('-' * 89)
            print('| end of epoch {:3d} | time: {:5.2f}s | valid loss {:5.2f} | '
                    'valid ppl {:8.2f}'.format(
                        epoch, (time.time() - epoch_start_time), val_loss, math.exp(val_loss)))
            print('-' * 89)
            # Save the model if the validation loss is the best we've seen so far.
            if val_loss < best_val_loss:
                model_save(args_save)
                best_val_loss = val_loss
            
            elif len(stored_losses) > args_nonmono and val_loss > min(stored_losses[:-args_nonmono]):
                print('Switching to ASGD')
                optimizer = torch.optim.ASGD(model.parameters(), lr=lr, t0=0, lambd=0.)

            stored_losses.append(val_loss)
               
except KeyboardInterrupt:
    print('-' * 89)
    print('Exiting from training early')


| epoch   1 |   100/  663 batches | lr 30.00 | ms/batch 279.10 | loss  7.58 | ppl  1954.87
| epoch   1 |   200/  663 batches | lr 30.00 | ms/batch 275.48 | loss  6.71 | ppl   818.22
| epoch   1 |   300/  663 batches | lr 30.00 | ms/batch 275.57 | loss  6.49 | ppl   661.46
| epoch   1 |   400/  663 batches | lr 30.00 | ms/batch 275.75 | loss  6.26 | ppl   520.77
| epoch   1 |   500/  663 batches | lr 30.00 | ms/batch 276.15 | loss  6.10 | ppl   444.75
| epoch   1 |   600/  663 batches | lr 30.00 | ms/batch 275.70 | loss  5.95 | ppl   382.56
-----------------------------------------------------------------------------------------
| end of epoch   1 | time: 192.70s | valid loss  5.74 | valid ppl   312.41
-----------------------------------------------------------------------------------------

| epoch   2 |   100/  663 batches | lr 30.00 | ms/batch 278.65 | loss  5.87 | ppl   354.77
| epoch   2 |   200/  663 batches | lr 30.00 | ms/batch 276.11 | loss  5.75 | ppl   313.99
| epoch   2 |   300/  663 batches | lr 30.00 | ms/batch 275.69 | loss  5.71 | ppl   302.87
| epoch   2 |   400/  663 batches | lr 30.00 | ms/batch 276.12 | loss  5.60 | ppl   271.39
| epoch   2 |   500/  663 batches | lr 30.00 | ms/batch 276.14 | loss  5.55 | ppl   257.65
| epoch   2 |   600/  663 batches | lr 30.00 | ms/batch 276.14 | loss  5.47 | ppl   237.72
-----------------------------------------------------------------------------------------
| end of epoch   2 | time: 192.82s | valid loss  5.31 | valid ppl   201.54
-----------------------------------------------------------------------------------------
| epoch   3 |   100/  663 batches | lr 30.00 | ms/batch 279.06 | loss  5.50 | ppl   243.65
| epoch   3 |   200/  663 batches | lr 30.00 | ms/batch 276.11 | loss  5.43 | ppl   227.37
| epoch   3 |   3
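
The rule that switches the optimizer from SGD to ASGD in the training loop is a non-monotone trigger: it fires once the current validation loss is worse than the best loss recorded more than `args_nonmono` epochs ago. Isolated as a function (a hypothetical helper, not used by the notebook), the condition reads:

```python
def should_switch_to_asgd(stored_losses, val_loss, nonmono):
    """True when val_loss is worse than the best loss seen more than nonmono epochs ago."""
    return (len(stored_losses) > nonmono
            and val_loss > min(stored_losses[:-nonmono]))

# Too little history: never switch during the first nonmono epochs.
print(should_switch_to_asgd([5.0, 4.8], 4.9, nonmono=5))                          # False
# The older best (4.4) has not been beaten by the latest epoch (4.5): switch.
print(should_switch_to_asgd([5.0, 4.4, 4.6, 4.5, 4.5, 4.5, 4.5], 4.5, nonmono=5))  # True
```

Note that in the loop above this check sits in an `elif`, so it is only evaluated in epochs where the validation loss is not a new best.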

Finally, open the best saved model and run it on the test data

In [None]:
# Load the best saved model.
model_load(args_save)

# Run on test data.
test_loss = evaluate(test_data)
print('=' * 89)
print('| End of training | test loss {:5.2f} | test ppl {:8.2f}'.format(
    test_loss, math.exp(test_loss)))
print('=' * 89)


| End of training | test loss  4.21 | test ppl    67.40
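
The logged perplexity is the exponential of the average cross-entropy loss (in nats), and `evaluate` averages the per-chunk losses weighted by chunk length. A small sketch of that arithmetic, with hypothetical helper names and made-up chunk losses:

```python
import math

def perplexity(loss_nats):
    """Perplexity is exp of the mean cross-entropy measured in nats."""
    return math.exp(loss_nats)

def bits_per_token(loss_nats):
    """Convert a loss in nats to bits by dividing by ln(2)."""
    return loss_nats / math.log(2)

def weighted_mean_loss(chunk_losses, chunk_lengths):
    """Length-weighted average, as in evaluate(): sum(len * loss) / total length."""
    total = sum(n * l for l, n in zip(chunk_losses, chunk_lengths))
    return total / sum(chunk_lengths)

print(round(perplexity(4.21), 2))                # 67.36
print(weighted_mean_loss([4.0, 5.0], [70, 10]))  # 4.125
```

The small gap between 67.36 here and the 67.40 in the log is presumably because the log prints the loss rounded to two decimals while the perplexity is computed from the unrounded value.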


# Word generator

First define the arguments and load the corpus

In [None]:
model_load(args_save)

args_words = 300
args_temperature = 0.8
model.eval()

corpus = Corpus(args_data)
ntokens = len(corpus.dictionary)
hidden = model.init_hidden(1)



Then generate some data

In [None]:
input = torch.randint(ntokens, (1, 1), dtype=torch.long).to(device)

words = []
probs = []

with torch.no_grad():  # no tracking history
    for i in range(args_words):
        output, hidden = model(input, hidden)
        
        word_weights = output.squeeze().div(args_temperature).exp().cpu()
        word_idx = torch.multinomial(word_weights, 1)[0]
        input.fill_(word_idx)
        word = corpus.dictionary.idx2word[word_idx]
        
        # We replace <unk> and <eos> with * to get a cleaner look, but that's just
        # personal preference
        if(word == "<unk>" or word == "<eos>"):
          word = "*"

        print(word + ('\n' if i % 20 == 19 else ' '),end='')
        
        # We also create arrays with the generated words and their probability 
        # to be used for visualizing them in a tool that we created for this
        # purpose
        words.append(word)
        probs.append(output.squeeze()[word_idx].data.tolist())


used in the improved asia * * producers play its * now * themselves of its * as they were
off tag says * stein president and chief executive of * group inc. a oil and gas company * this
june the executives for instance have * their down * the new york 's main real-estate refinery are * and
* these rumors last year on my technologies and ballot * he dressed himself a new mexico * in *
* and business trying to get wine is english at a * time to stay today * mr. sherman what
's more those swings might have counted * and a * cheaper ducks but a * pressure consistent with the
seniors and investment privacy as low as sophisticated parties have turned when they are * there are charities * who
can afford as the plants * in recent weeks on the bargain this year he has a * reputation for
the first of these things * filling new time for giants now if i give it a * there will
be a slew of real rooms * investors you support the big board to fit * * he adds he
wants years later after much bear stear
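
The effect of `args_temperature` on sampling can be illustrated without the model. The sketch below divides a toy score vector by the temperature and exponentiates it, as the generation loop does, then samples from the resulting weights. The helper names and the three-element score vector are hypothetical, and subtracting the maximum before `exp` is an added numerical-stability step that the notebook code does not perform.

```python
import math
import random

def temperature_weights(scores, temperature):
    """Unnormalized sampling weights: exp(score / temperature)."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtracting the max leaves the ratios unchanged
    return [math.exp(s - peak) for s in scaled]

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def sample_index(weights, rng):
    """Draw one index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

scores = [2.0, 1.0, 0.0]
sharp = normalize(temperature_weights(scores, 0.5))  # low temperature
flat = normalize(temperature_weights(scores, 2.0))   # high temperature
print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
print(sample_index(temperature_weights(scores, 0.8), random.Random(141)))
```

Lower temperatures concentrate probability on the highest-scoring word and give more conservative text, while higher temperatures flatten the distribution and give more varied output.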

Print the words and probabilities for use in a [visualization tool](https://github.com/mikkelbrusen/text-weight-visualizer) we created

In [None]:
print(words)
print(probs)

['used', 'in', 'the', 'improved', 'asia', '*', '*', 'producers', 'play', 'its', '*', 'now', '*', 'themselves', 'of', 'its', '*', 'as', 'they', 'were', 'off', 'tag', 'says', '*', 'stein', 'president', 'and', 'chief', 'executive', 'of', '*', 'group', 'inc.', 'a', 'oil', 'and', 'gas', 'company', '*', 'this', 'june', 'the', 'executives', 'for', 'instance', 'have', '*', 'their', 'down', '*', 'the', 'new', 'york', "'s", 'main', 'real-estate', 'refinery', 'are', '*', 'and', '*', 'these', 'rumors', 'last', 'year', 'on', 'my', 'technologies', 'and', 'ballot', '*', 'he', 'dressed', 'himself', 'a', 'new', 'mexico', '*', 'in', '*', '*', 'and', 'business', 'trying', 'to', 'get', 'wine', 'is', 'english', 'at', 'a', '*', 'time', 'to', 'stay', 'today', '*', 'mr.', 'sherman', 'what', "'s", 'more', 'those', 'swings', 'might', 'have', 'counted', '*', 'and', 'a', '*', 'cheaper', 'ducks', 'but', 'a', '*', 'pressure', 'consistent', 'with', 'the', 'seniors', 'and', 'investment', 'privacy', 'as', 'low', 'as',