In [0]:
# Copyright 2019 Google Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#Predicting Movie Review Sentiment with BERT on TF Hub

If you’ve been following Natural Language Processing over the past year, you’ve probably heard of BERT: Bidirectional Encoder Representations from Transformers. It’s a neural network architecture designed by Google researchers that’s totally transformed what’s state-of-the-art for NLP tasks, like text classification, translation, summarization, and question answering.

Now that BERT's been added to [TF Hub](https://www.tensorflow.org/hub) as a loadable module, it's easy(ish) to add into existing TensorFlow text pipelines. In an existing pipeline, BERT can replace text embedding layers like ELMo and GloVe. Alternatively, [finetuning](http://wiki.fast.ai/index.php/Fine_tuning) BERT can provide both an accuracy boost and faster training time in many cases.

Here, we'll train a model to predict whether an IMDB movie review is positive or negative using BERT in TensorFlow with TF Hub. Some code was adapted from [this colab notebook](https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb). Let's get started!

In [1]:
from sklearn.model_selection import train_test_split
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from datetime import datetime

In [2]:
tf.__version__

'1.14.0'

In addition to the standard libraries we imported above, we'll need BERT's Python package. It has to be installed before the import below will work (it is published on PyPI as `bert-tensorflow`).

In [3]:
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization

W0709 10:59:24.946227 16944 deprecation_wrapper.py:119] From C:\Users\xiangyangcao\AppData\Local\Continuum\anaconda3\lib\site-packages\bert\optimization.py:87: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.



Below, we'll set an output directory location to store our model output and checkpoints. This can be a local directory, in which case you'd set OUTPUT_DIR to the name of the directory you'd like to create. If you're running this code in Google's hosted Colab, the directory won't persist after the Colab session ends.

Alternatively, if you're a GCP user, you can store output in a GCP bucket. To do that, set a directory name in OUTPUT_DIR and the name of the GCP bucket in the BUCKET field.

Set DO_DELETE to True to delete and recreate OUTPUT_DIR if it already exists. Otherwise, TensorFlow will load existing model checkpoints from that directory (if they exist).

In [4]:
# Set the output directory for saving model file
# Optionally, set a GCP bucket location

OUTPUT_DIR = 'OUTPUT_DIR_NAME'#@param {type:"string"}
#@markdown Whether or not to clear/delete the directory and create a new one
DO_DELETE = False #@param {type:"boolean"}
#@markdown Set USE_BUCKET and BUCKET if you want to (optionally) store model output on GCP bucket.
USE_BUCKET = False #@param {type:"boolean"}
BUCKET = 'BUCKET_NAME' #@param {type:"string"}

if USE_BUCKET:
  OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, OUTPUT_DIR)
  from google.colab import auth
  auth.authenticate_user()

if DO_DELETE:
  try:
    tf.gfile.DeleteRecursively(OUTPUT_DIR)
  except tf.errors.OpError:
    # Doesn't matter if the directory didn't exist
    pass
tf.gfile.MakeDirs(OUTPUT_DIR)
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))


***** Model output directory: OUTPUT_DIR_NAME *****


#Data

First, let's download the dataset, hosted by Stanford. The code below, which downloads, extracts, and imports the IMDB Large Movie Review Dataset, is borrowed from [this Tensorflow tutorial](https://www.tensorflow.org/hub/tutorials/text_classification_with_tf_hub).

In [5]:
from tensorflow import keras
import os
import re

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)
  
  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))
  
  return train_df, test_df


In [6]:
train, test = download_and_load_datasets()

Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz


To keep training fast, we'll sample 5,000 examples each from the train and test sets.

In [7]:
train = train.sample(5000)
test = test.sample(5000)

In [8]:
train.columns

Index(['sentence', 'sentiment', 'polarity'], dtype='object')

Our input data is the 'sentence' column and our label is the 'polarity' column (0 for negative and 1 for positive).
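The third column, 'sentiment', comes from the regular expression in `load_directory_data`. As a small illustration, here is how that pattern pulls a value out of a filename. The example filenames are assumptions based on the dataset's `<id>_<rating>.txt` naming, and `rating_from_filename` is a hypothetical helper, not part of the notebook (which keeps the value as a string rather than converting it to an integer).

```python
import re

# Same pattern as in load_directory_data, written as a raw string.
FILENAME_PATTERN = re.compile(r"\d+_(\d+)\.txt")

def rating_from_filename(name):
    """Return the number after the underscore, or None if the name doesn't match."""
    match = FILENAME_PATTERN.match(name)
    return int(match.group(1)) if match else None

print(rating_from_filename("0_9.txt"))     # assumed example filename
print(rating_from_filename("123_10.txt"))  # assumed example filename
print(rating_from_filename("README"))      # no match
```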

In [9]:
DATA_COLUMN = 'sentence'
LABEL_COLUMN = 'polarity'
# label_list is the list of labels, i.e. True, False or 0, 1 or 'dog', 'cat'
label_list = [0, 1]

#Data Preprocessing
We'll need to transform our data into a format BERT understands. This involves two steps. First, we create  `InputExample`'s using the constructor provided in the BERT library.

- `text_a` is the text we want to classify, which in this case is the `sentence` column in our DataFrame.
- `text_b` is used if we're training a model to understand the relationship between sentences (e.g. is `text_b` a translation of `text_a`? Is `text_b` an answer to the question asked by `text_a`?). This doesn't apply to our task, so we can leave `text_b` blank.
- `label` is the label for our example, here 0 (negative) or 1 (positive).

In [10]:
# Use the InputExample class from BERT's run_classifier code to create examples from the data
train_InputExamples = train.apply(lambda x: bert.run_classifier.InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this example
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                                   text_a = x[DATA_COLUMN], 
                                                                   text_b = None, 
                                                                   label = x[LABEL_COLUMN]), axis = 1)

In [110]:
train_InputExamples.head()

17399    <bert.run_classifier.InputExample object at 0x...
20260    <bert.run_classifier.InputExample object at 0x...
3596     <bert.run_classifier.InputExample object at 0x...
23436    <bert.run_classifier.InputExample object at 0x...
23545    <bert.run_classifier.InputExample object at 0x...
dtype: object

Next, we need to preprocess our data so that it matches the data BERT was trained on. For this, we'll need to do a couple of things (but don't worry--this is also included in the Python library):


1. Lowercase our text (if we're using a BERT lowercase model)
2. Tokenize it (e.g. "sally says hi" -> ["sally", "says", "hi"])
3. Break words into WordPieces (e.g. "calling" -> ["call", "##ing"])
4. Map our words to indexes using a vocab file that BERT provides
5. Add special "CLS" and "SEP" tokens (see the [readme](https://github.com/google-research/bert))
6. Append "index" and "segment" tokens to each input (see the [BERT paper](https://arxiv.org/pdf/1810.04805.pdf))

Happily, we don't have to worry about most of these details.
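To make steps 1 through 5 a little more concrete, here is a rough pure-Python sketch. It is not the BERT library's tokenizer: `TOY_VOCAB`, `wordpiece`, and `toy_encode` are hypothetical names, the vocabulary is a made-up toy one chosen so that "calling" splits the way the list above describes, and punctuation handling is ignored entirely.

```python
# Toy vocabulary (an assumption for illustration); the real vocab file is far larger.
TOY_VOCAB = ["[PAD]", "[UNK]", "[CLS]", "[SEP]",
             "sally", "says", "hi", "call", "##ing"]
VOCAB_IDS = {token: i for i, token in enumerate(TOY_VOCAB)}

def wordpiece(word, vocab):
    """Greedy longest-match-first split of one word into WordPieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a continuation piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # nothing matched, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

def toy_encode(text):
    """Lowercase, split on whitespace, WordPiece, add special tokens, map to ids."""
    tokens = ["[CLS]"]
    for word in text.lower().split():              # steps 1 and 2
        tokens.extend(wordpiece(word, VOCAB_IDS))  # step 3
    tokens.append("[SEP]")                         # step 5
    ids = [VOCAB_IDS[t] for t in tokens]           # step 4
    return tokens, ids

tokens, ids = toy_encode("Sally says hi calling")
print(tokens)
print(ids)
```

The greedy longest-match loop is the part worth reading: it keeps shrinking the candidate substring until it finds a vocabulary entry, then continues from where that piece ended.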




To start, we'll need to load a vocabulary file and lowercasing information directly from the BERT tf hub module:

In [11]:
# This is a path to an uncased (all lowercase) version of BERT
BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])
      
  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

W0709 11:03:47.926119 16944 deprecation_wrapper.py:119] From C:\Users\xiangyangcao\AppData\Local\Continuum\anaconda3\lib\site-packages\bert\tokenization.py:125: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.



Great--we just learned that the BERT model we're using expects lowercase data (that's what's stored in tokenization_info["do_lower_case"]) and we also loaded BERT's vocab file. We also created a tokenizer, which breaks words into word pieces:

In [12]:
tokenizer.tokenize("This here's an example of using the BERT tokenizer")

['this',
 'here',
 "'",
 's',
 'an',
 'example',
 'of',
 'using',
 'the',
 'bert',
 'token',
 '##izer']

Using our tokenizer, we'll call `run_classifier.convert_examples_to_features` on our InputExamples to convert them into features BERT understands.
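For a single-sentence task like ours, the conversion boils down to truncating, adding the special tokens, and padding to a fixed length. The sketch below shows that idea under stated assumptions: `build_features` is a hypothetical helper rather than the library function, and the default ids for `[CLS]`, `[SEP]`, and padding are what we'd expect from the uncased vocab file, so treat them as illustrative.

```python
def build_features(token_ids, max_seq_length, cls_id=101, sep_id=102, pad_id=0):
    """Truncate, add [CLS]/[SEP], and pad one single-sentence input."""
    body = token_ids[: max_seq_length - 2]   # leave room for [CLS] and [SEP]
    input_ids = [cls_id] + body + [sep_id]
    input_mask = [1] * len(input_ids)        # 1 marks a real token
    segment_ids = [0] * len(input_ids)       # one sentence, so everything is segment 0
    padding = max_seq_length - len(input_ids)
    input_ids += [pad_id] * padding
    input_mask += [0] * padding              # 0 marks padding
    segment_ids += [0] * padding
    return input_ids, input_mask, segment_ids

# A short input is padded up to the maximum length.
short_ids, short_mask, short_segments = build_features([7, 8, 9], max_seq_length=8)
print(short_ids)
print(short_mask)

# A long input is truncated so that the total length stays at the maximum.
long_ids, long_mask, _ = build_features(list(range(1000, 1010)), max_seq_length=8)
print(long_ids)
```

One consequence for this notebook: any review longer than the sequence limit is cut off, so the model only sees the beginning of long reviews.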

In [16]:
# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
# Convert our train and test features to InputFeatures that BERT understands.
train_features = bert.run_classifier.convert_examples_to_features(train_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)

#Creating a model

Now that we've prepared our data, let's focus on building a model. `create_model` does just this below. First, it loads the BERT tf hub module again (this time to extract the computation graph). Next, it creates a single new layer that will be trained to adapt BERT to our sentiment task (i.e. classifying whether a movie review is positive or negative). This strategy of using a mostly trained model is called [fine-tuning](http://wiki.fast.ai/index.php/Fine_tuning).
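Before reading the TensorFlow version, it may help to see the arithmetic of that single new layer on one example. This is a plain-Python sketch with made-up numbers: the three-element `pooled` vector stands in for BERT's much larger pooled output, `classify` is a hypothetical name, and dropout is left out.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def classify(pooled, weights, bias, label=None):
    """Linear layer followed by log-softmax for one pooled vector."""
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    log_probs = log_softmax(logits)
    predicted = max(range(len(log_probs)), key=log_probs.__getitem__)
    # Cross-entropy against a one-hot label is minus the true class's log-probability.
    loss = None if label is None else -log_probs[label]
    return predicted, log_probs, loss

pooled = [0.5, -1.0, 2.0]                        # stand-in for BERT's pooled output
weights = [[0.1, 0.2, -0.3], [-0.1, 0.0, 0.4]]   # shape [num_labels, hidden_size]
bias = [0.0, 0.0]

predicted, log_probs, loss = classify(pooled, weights, bias, label=1)
print(predicted, log_probs, loss)
```

The weight matrix has one row per label, which is why the TensorFlow code multiplies with `transpose_b=True`.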

In [25]:
def create_model(is_predicting, input_ids, input_mask, segment_ids, labels,
                 num_labels):
  """Creates a classification model."""

  bert_module = hub.Module(
      BERT_MODEL_HUB,
      trainable=True)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)

  # Use "pooled_output" for classification tasks on an entire sentence.
  # Use "sequence_output" for token-level output.
  output_layer = bert_outputs["pooled_output"]

  hidden_size = output_layer.shape[-1].value

  # Create our own layer to tune for the sentiment data.
  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):

    # Dropout helps prevent overfitting
    output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    # Convert labels into one-hot encoding
    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    predicted_labels = tf.squeeze(tf.argmax(log_probs, axis=-1, output_type=tf.int32))
    # If we're predicting, we want predicted labels and the log-probabilities.
    if is_predicting:
      return (predicted_labels, log_probs)

    # If we're train/eval, compute loss between predicted and actual label
    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)
    return (loss, predicted_labels, log_probs)


Next we'll wrap our model function in a `model_fn_builder` function that adapts our model to work for training, evaluation, and prediction.

In [26]:
# model_fn_builder actually creates our model function
# using the passed parameters for num_labels, learning_rate, etc.
def model_fn_builder(num_labels, learning_rate, num_train_steps,
                     num_warmup_steps):
  """Returns `model_fn` closure for the Estimator."""
  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for the Estimator."""

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]

    is_predicting = (mode == tf.estimator.ModeKeys.PREDICT)
    
    # TRAIN and EVAL
    if not is_predicting:

      (loss, predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      train_op = bert.optimization.create_optimizer(
          loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu=False)

      # Calculate evaluation metrics. 
      def metric_fn(label_ids, predicted_labels):
        accuracy = tf.metrics.accuracy(label_ids, predicted_labels)
        f1_score = tf.contrib.metrics.f1_score(
            label_ids,
            predicted_labels)
        auc = tf.metrics.auc(
            label_ids,
            predicted_labels)
        recall = tf.metrics.recall(
            label_ids,
            predicted_labels)
        precision = tf.metrics.precision(
            label_ids,
            predicted_labels) 
        true_pos = tf.metrics.true_positives(
            label_ids,
            predicted_labels)
        true_neg = tf.metrics.true_negatives(
            label_ids,
            predicted_labels)   
        false_pos = tf.metrics.false_positives(
            label_ids,
            predicted_labels)  
        false_neg = tf.metrics.false_negatives(
            label_ids,
            predicted_labels)
        return {
            "eval_accuracy": accuracy,
            "f1_score": f1_score,
            "auc": auc,
            "precision": precision,
            "recall": recall,
            "true_positives": true_pos,
            "true_negatives": true_neg,
            "false_positives": false_pos,
            "false_negatives": false_neg
        }

      eval_metrics = metric_fn(label_ids, predicted_labels)

      if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode=mode,
          loss=loss,
          train_op=train_op)
      else:
          return tf.estimator.EstimatorSpec(mode=mode,
            loss=loss,
            eval_metric_ops=eval_metrics)
    else:
      (predicted_labels, log_probs) = create_model(
        is_predicting, input_ids, input_mask, segment_ids, label_ids, num_labels)

      predictions = {
          # Note: these values are log-probabilities (log-softmax output)
          'probabilities': log_probs,
          'labels': predicted_labels
      }
      return tf.estimator.EstimatorSpec(mode, predictions=predictions)

  # Return the actual model function in the closure
  return model_fn


In [27]:
# Training hyperparameters
# These hyperparameters are copied from this colab notebook (https://colab.sandbox.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb)
BATCH_SIZE = 32
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 3.0
# Warmup is a period of time where the learning rate
# is small and gradually increases--usually helps training.
WARMUP_PROPORTION = 0.1
# Model configs
SAVE_CHECKPOINTS_STEPS = 500
SAVE_SUMMARY_STEPS = 100

In [28]:
# Compute the number of training and warmup steps from the batch size
num_train_steps = int(len(train_features) / BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
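
With 5,000 training examples, these formulas give 468 training steps and 46 warmup steps. The sketch below reproduces that arithmetic and shows the shape of the schedule as we understand it: a linear ramp during warmup, then a linear decay to zero. `warmup_then_decay_lr` and the `example_*` variables are hypothetical names for illustration only, and the function is an approximation of the library's behaviour rather than a copy of it.

```python
def warmup_then_decay_lr(step, init_lr, num_train_steps, num_warmup_steps):
    """Linear warmup from 0, then linear decay to 0 (illustrative sketch)."""
    if step < num_warmup_steps:
        return init_lr * step / num_warmup_steps
    step = min(step, num_train_steps)
    return init_lr * (1.0 - step / num_train_steps)

# Same arithmetic as the cell above, with this notebook's values filled in.
example_train_steps = int(5000 / 32 * 3.0)
example_warmup_steps = int(example_train_steps * 0.1)
print(example_train_steps, example_warmup_steps)

for step in (0, 23, 46, 234, 468):
    lr = warmup_then_decay_lr(step, 2e-5, example_train_steps, example_warmup_steps)
    print(step, lr)
```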

In [29]:
# Specify output directory and number of checkpoint steps to save
run_config = tf.estimator.RunConfig(
    model_dir=OUTPUT_DIR,
    save_summary_steps=SAVE_SUMMARY_STEPS,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS)

In [30]:
model_fn = model_fn_builder(
  num_labels=len(label_list),
  learning_rate=LEARNING_RATE,
  num_train_steps=num_train_steps,
  num_warmup_steps=num_warmup_steps)

estimator = tf.estimator.Estimator(
  model_fn=model_fn,
  config=run_config,
  params={"batch_size": BATCH_SIZE})


Next we create an input builder function that takes our training feature set (`train_features`) and produces an input function that feeds batches of features to the Estimator. This is a pretty standard design pattern for working with TensorFlow [Estimators](https://www.tensorflow.org/guide/estimators).
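The builder pattern itself is easy to sketch without TensorFlow. In the simplified version below, `make_input_fn` is a hypothetical stand-in: it returns a function that yields plain Python lists, makes a single pass over the data, and takes the batch size directly, whereas the library version builds a TensorFlow dataset. It is mainly meant to show what `drop_remainder` does to the last, partial batch.

```python
import random

def make_input_fn(features, batch_size, is_training, drop_remainder, seed=0):
    """Return a zero-argument function that yields batches of features."""
    def input_fn():
        items = list(features)
        if is_training:
            random.Random(seed).shuffle(items)  # shuffle only when training
        for start in range(0, len(items), batch_size):
            batch = items[start:start + batch_size]
            if drop_remainder and len(batch) < batch_size:
                break  # discard the final, smaller batch
            yield batch
    return input_fn

fake_features = range(5000)  # placeholders for 5,000 feature records
keep_all = list(make_input_fn(fake_features, 32, True, drop_remainder=False)())
drop_last = list(make_input_fn(fake_features, 32, True, drop_remainder=True)())
print(len(keep_all), len(keep_all[-1]))
print(len(drop_last), len(drop_last[-1]))
```

With 5,000 examples and a batch size of 32, keeping the remainder gives one extra batch of 8 examples. Fixed batch shapes matter on TPUs, which is why the comment in the next cell mentions setting `drop_remainder` to True there.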

In [31]:
# Create an input function for training. drop_remainder = True for using TPUs.
train_input_fn = bert.run_classifier.input_fn_builder(
    features=train_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=True,
    drop_remainder=False)

Now we train our model! For me, using a Colab notebook running on Google's GPUs, my training time was about 14 minutes.

### Training here on CPU is freaking slow (a little over 4 hours in the run below)

In [32]:
print('Beginning Training!')
current_time = datetime.now()
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
print("Training took time ", datetime.now() - current_time)

W0709 14:16:36.536499 16944 deprecation.py:323] From C:\Users\xiangyangcao\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\training\training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.


Beginning Training!


W0709 14:16:42.099656 16944 deprecation.py:506] From <ipython-input-25-ca03218f28a6>:34: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
W0709 14:16:42.132150 16944 deprecation_wrapper.py:119] From C:\Users\xiangyangcao\AppData\Local\Continuum\anaconda3\lib\site-packages\bert\optimization.py:27: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.

W0709 14:16:42.133633 16944 deprecation_wrapper.py:119] From C:\Users\xiangyangcao\AppData\Local\Continuum\anaconda3\lib\site-packages\bert\optimization.py:32: The name tf.train.polynomial_decay is deprecated. Please use tf.compat.v1.train.polynomial_decay instead.

W0709 14:16:42.138622 16944 deprecation.py:323] From C:\Users\xiangyangcao\AppData\Local\Continuum\anaconda3\lib\site-pack

Training took time  4:12:12.995158


Now let's use our test data to see how well our model did:

In [40]:
test_input_fn = run_classifier.input_fn_builder(
    features=test_features,
    seq_length=MAX_SEQ_LENGTH,
    is_training=False,
    drop_remainder=False)

In [41]:
estimator.evaluate(input_fn=test_input_fn, steps=None)

  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


{'auc': 0.86760396,
 'eval_accuracy': 0.8676,
 'f1_score': 0.8663166,
 'false_negatives': 356.0,
 'false_positives': 306.0,
 'loss': 0.5470487,
 'precision': 0.875153,
 'recall': 0.85765696,
 'true_negatives': 2193.0,
 'true_positives': 2145.0,
 'global_step': 468}
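
As a sanity check, the headline metrics can be recomputed by hand from the four confusion counts in the output above. This is plain arithmetic, using only the numbers reported there. The last line is an observation rather than a documented guarantee: because the AUC metric here is fed hard 0/1 predictions instead of scores, it comes out very close to the average of recall and specificity.

```python
# Confusion counts copied from the evaluation output above.
tp, tn, fp, fn = 2145, 2193, 306, 356

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
specificity = tn / (tn + fp)
balanced_accuracy = (recall + specificity) / 2

print("accuracy ", round(accuracy, 4))
print("precision", round(precision, 4))
print("recall   ", round(recall, 4))
print("f1       ", round(f1, 4))
print("balanced ", round(balanced_accuracy, 4))
```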

Now let's write code to make predictions on new sentences:

In [42]:
def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, label = 0) for x in in_sentences] # here, guid="" and label=0 are just dummy values
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  predictions = estimator.predict(predict_input_fn)
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

In [43]:
pred_sentences = [
  "That movie was absolutely awful",
  "The acting was a bit lacking",
  "The film was creative and surprising",
  "Absolutely fantastic!"
]

In [44]:
predictions = getPrediction(pred_sentences)

Voila! We have a sentiment classifier!

In [45]:
predictions

[('That movie was absolutely awful',
  array([-1.1609012e-03, -6.7591658e+00], dtype=float32),
  'Negative'),
 ('The acting was a bit lacking',
  array([-2.1196771e-03, -6.1575608e+00], dtype=float32),
  'Negative'),
 ('The film was creative and surprising',
  array([-6.1112604e+00, -2.2201908e-03], dtype=float32),
  'Positive'),
 ('Absolutely fantastic!',
  array([-6.1465464e+00, -2.1431115e-03], dtype=float32),
  'Positive')]
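
The arrays in this output are log-probabilities, so they are easier to read after exponentiating. A quick check on the first prediction, using the two values printed above:

```python
import math

# Log-probabilities for "That movie was absolutely awful", copied from the output above.
log_probs = [-1.1609012e-03, -6.7591658e+00]

probs = [math.exp(lp) for lp in log_probs]
print("P(negative) =", round(probs[0], 4))
print("P(positive) =", round(probs[1], 4))
print("sum         =", round(sum(probs), 4))
```

The two probabilities sum to one up to rounding, which is what we'd expect from a log-softmax output.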

In [46]:
pred_sentences = [
  "Good",
  "Bad",
  "NA"
]

getPrediction(pred_sentences)

[('Good', array([-2.4273534, -0.0924115], dtype=float32), 'Positive'),
 ('Bad', array([-4.7138515e-03, -5.3596144e+00], dtype=float32), 'Negative'),
 ('NA', array([-2.287565  , -0.10704336], dtype=float32), 'Positive')]

In [48]:
f = "March 22, 2019 | Equity Research \n \n \n \n \n Wells Fargo Technology Weekly \nSemi Thoughts, Weak S. Korea Memory Data, \nNetworking Qtrly Review, & More \n \n \n\uf0b7  Strong  Semiconductor  Outperformance  Resulting  in  Increased \nInvestor Angst Ahead of 1Q19 Earnings Season?  With the recent \nstrong  performance  in  semiconductors  (SOX  +1.3%  and  +22.1%  last \nweek  and  YTD vs.  S&P +1% and  +13%, respectively) we have begun \nto  receive  increasing  investor  questions  on  the  set-up  into  1Q19 \nearnings  season.   On  a  near-term  overall  basis,  we  think  some  profit \ntaking  in  semis  could  materialize  as  a  more  cautious  stance  on  1Q19 \nearnings  season  materializes  -  most  investors  we  have  spoken  with \nexpect choppy / weak 1Q19 results and cautious 2Q19 outlooks;  hope \nof  a  materializing  2H2019  recovery  remains  key  focus.   That  said,  we \nthink  it  is  becoming  increasingly  important  to  consider  company-\nspecific dynamics  – we continue to highlight:  (1) AMD: Shares rallied \nlast week (+15%) on Google’s Stadia (Project Stream) game streaming \nannouncement  confirming  the  use  of  AMD’s  Radeon  GPUs  (a  4Q18 \ndriver)  and  CPUs  –  implied  use  of  AMD’s  7nm  Rome  EPYC  CPUs \nconsidered  to  be  new  news.  We  remain  positive  on  AMD’s  positioning \nfor  datacenter  momentum  into  mid/2H2019  (note:  we  think  some \nRome  shipments  could  commence  in  2Q19),  coupled  with  continued \nbroadening PC adoption of AMD’s Ryzen processors through 2019.  (2) \nNVIDIA:  NVIDIA  shares  were  up  5.5%  this  week  (+34%  YTD)  as  we \ncontinue  to  see  increasing  evidence  of  a  completed  gaming  channel \ninventory burn-through, coupled positive comments on NVIDIA’s Turing \nRTX  ramp  following  full  product  suite  availability  commencing  in  late-\nJanuary.  
We  also  remain  positive  on  NVIDIA’s  reacceleration  of \ndatacenter  growth  into  2H2019  and  the  strategic  merits  /  accretion \nfrom the company’s acquisition of Mellanox. \n\uf0b7  Weak  South  Korea  Trade  Data:  (1)  NAND.  Exports  of  NAND  Flash \non a US$ basis in South Korea declined 41% y/y in Feb.; -34% y/y for \nJan + Feb. Unit exports were -8% y/y in Jan + Feb. We estimate total \nexports down ~40% y/y in 1Q19; units -10% y/y. (2) DRAM. Exports \n+ imports of DRAM declined 33% y/y in Feb.; -27% y/y for Jan + Feb. \nUnit  exports  +  imports  are  -22%  y/y  in  Feb;  -20%  y/y  in  Jan  +  Feb. \nWe  estimate  total  exports  +  imports  down  ~30%  y/y  in  1Q19;  units \ndown 22% y/y.  \n\uf0b7  Networking  Quarterly  Review  (See  Pgs.  7-12):  Although \nbackwards  looking,  we  thought  we  would  highlight  IDC’s  Ethernet \nSwitching  quarterly  data  published  this  week:  (1)  The  datacenter \nEthernet switching market grew 7% y/y to $3.1B in 4Q18; Public Cloud \n+18.5% y/y (46% of total), while Enterprise was -1% y/y and Service \nProvider was +3% y/y. (2) 100G Positioning: IDC estimates Arista’s \n100G  port  share  at  24%  in  4Q18  (flat  q/q),  compared  to  Cisco  and \nJuniper  at  a  ~16%  and  3%  (vs.  19%  and  3%  in  3Q18).  ODM, \nAccton/EdgeCore  continues  to  have  the  highest  share  at  35%.  (4) \nCisco  Campus  Momentum:  IDC’s  data  highlights  accelerating \nmomentum for Cisco’s campus switching business - +23% y/y in 4Q18, \ncompared to -4%, +6%, -5%, and +8% y/y in 4Q17, 1Q18, 2Q18, and \n3Q18  respectively.  We  remain  focused  on  Cisco’s  Catalyst  9k  refresh, \nwhich was noted as growing double digits y/y in the Jan. ’19 quarter.  \n \n \nPlease  see  page  32  for  rating  definitions,  important  disclosures  and \nrequired analyst certifications. All estimates/forecasts are as of  03/22/19 \nunless otherwise stated. 
03/22/19 14:34:38 ET \n \nWells Fargo Securities, LLC does and seeks to do business with companies covered \nin  its  research  reports.    As  a  result,  investors  should  be  aware  that  the  firm  may \nhave  a  conflict  of  interest  that  could  affect  the  objectivity  of  the  report  and \ninvestors  should  consider  this  report  as  only  a  single  factor  in  making  their \ninvestment decision. \n \nIT Hardware & \nCommunications \nNetworking \n \n \nAaron Rakers, CFA \nS e n i o r   A n a l y s t | 3 1 4 - 8 7 5 - 2 5 0 8  \na a r o n . r a ke r s @ w e l l s f a r g o . c o m  \nJoe Quatrochi, CFA \nA s s o c i a t e   A n a l y s t | 3 1 4 - 8 7 5 - 2 0 5 5  \nj o e . q u a t r o c h i @ w e l l s f a r g o . c o m  \nJake Wilhelm, CFA, CPA \nA s s o c i a t e   A n a l y s t | 3 1 4 - 8 7 5 - 2 5 0 2  \nj a k e . w i l h e l m @ w e l l s f a r g o . c o m  \nMichael Tsvetanov \nA s s o c i a t e   A n a l y s t | 3 1 4 - 8 7 5 - 2 5 5 8  \nm i c h a e l . t s v e t a n o v @ w e l l s f a r g o . c o m    \nIT Hardware & Communications Networking \nEquity Research \nHighlighted Industry News / Thoughts: \n \nDRAMeXchange Sees Ongoing Price Pressures in DRAM; Some Return to Hyperscale Purchases \nin March \n \nDRAMeXchange  published  a  report  outlining  its  expectation  of  ongoing  price  pressures  in  the  DRAM \nmarket.  Key highlights include:  \n \n\uf0b7  DRAMeXchange expects overall DRAM prices to fall approximately 20% q/q in 1Q19 (no surprise). 2Q19 \nand 3Q19 prices are estimated to decline by 15%-20% and ~10% q/q, respectively. \n\uf0b7  This  includes  server  DRAM prices  estimated  to  decline  by 20%  q/q  and  ~10%  q/q  in 2Q19  and 3Q19, \nrespectively; PC DRAM prices estimated to decline at a similar rate.  
eMCP pricing or mobile devices are \nestimated  to  decline  by  10%-20%  q/q  in  2Q19  and  5%-10%  q/q  in  2Q190  and  3Q19,  respectively; \ndiscrete mobile DRAM prices are expected to decline by 5%-10% q/q in both 2Q19 and 3Q19. \n\uf0b7  DRAMeXchange  estimates  that  inventory  levels  for  DRAM  suppliers  have  increased  to  over  6  weeks \n(note:  Micron  reported  134  days  of  total  inventory  on  a  dollar  basis  exiting  their  February  quarter).  \nServer & PC customers were noted to be sitting on over 7 weeks of DRAM inventory.   \n\uf0b7  1Ynm  DRAM  is  expected  to  be  a  key  driver  of  continued  bit  supply  growth;  however,  DRAMeXchange \nalso  notes  that  DRAM  suppliers  are  continuing  to  adopt  large  price  reductions  in  order  to  stimulate \ndemand.   \n\uf0b7  The  write-up  notes  that  DRAM  content  per  box  expansion  is  expected  to  perform  lower  than  what  was \nseen in 2018 across all product categories.  \n\uf0b7  DRAMeXchange notes that a few N. American hyperscale datacenter customers started to return to place \norders in March.  \n \nS. Korea Trade Data: Jan + Feb NAND Exports -34% Y/Y; DRAM Imports + Exports -27% Y/Y  \nOn a US$ basis, exports of NAND Flash in South Korea declined 41% y/y in February (-10% m/m); down \n34%  y/y  for  January  +  February.  Imports  of  NAND  Flash  declined  42.5%  y/y  for  January  +  February. \nFrom a quantity basis, NAND Flash exports are down 5% y/y in February and down 8% y/y in January + \nFebruary.  Using  the  5  year  average  January  +  February  contribution,  we  would  be  left  to  estimate  total \nexports down nearly 40% y/y in 1Q19; units down 10% y/y. This would imply average 1Q19 pricing down \n18% sequentially (-30% y/y).  \n \nExports  +  imports  of  DRAM  in  South  Korea  declined  33%  y/y  in  February  (vs.  -21.5%  y/y  in  January);  \n-27% y/y for January + February. Exports of DRAM decreased 37.5% y/y for January + February. 
From a \nquantity basis, DRAM exports + imports are  -22% y/y in February and -20% y/y in January + February. \nUsing  the 5 year  average  January  + February  contribution,  we would be  left  to  estimate  total exports + \nimports down nearly 30% y/y in 1Q19; units down 22% y/y. While mix should be considered, this would \nimply average 1Q19 pricing down 17% sequentially (-9% y/y). \n \n \n \n2 | Wells Fargo Securities, LLC \n \nWells Fargo Technology Weekly \n \nEquity Research \n \n \n \n \n \n \n \n \n \n \nWells Fargo Securities, LLC | 3 \nIT Hardware & Communications Networking \nEquity Research \nNAND Spot Pricing: TLC Composite Down 0.6%, 256Gb and 512Gb Down 1.5% and 3.2% \n\uf0b7  NAND  spot  pricing  indicated  that  tracked  TLC  was  down  an  average  of  0.6%  w/w  following  a  0.6% \ndecrease  last  week.  The  decline  was  led  by  128Gb  TLC  that  was  down  1.3%  while  256Gb  and  512Gb \nTLC, not yet included in our composite, were down 1.5% and 3.2%, respectively. Year-to-date 128Gb, \n256Gb, and 512Gb TLC have declined 5%, 13%, and 21%, respectively.  \n \n\uf0b7  For 1Q19, TLC spot pricing is now down 5% sequentially and 46.6% y/y while MLC spot pricing is down \n1.3% sequentially and 10.1% y/y. High capacity 128Gb TLC is down 5.7% sequentially and 55.8% y/y.  \n \n \n \n \n \n \nDRAM Spot Pricing: Tracking Down 12% Q/Q for 1Q19 \n \n\uf0b7   DRAM spot pricing was down an average of 0.5% this week driven by declines in DDR3 2Gb 128Mx16  \n(-1.5%) and DDR3 4Gb 256Mx16 (1.4%). Inspectrum noted that inquiries were limited during the week \nand the market was sluggish. 8Gb DDR4 (not included in the composite) was down 2.9% w/w while 8Gb \nDDR4 WB was down 1.7%.   \n \n\uf0b7  For 1Q19, overall DRAM spot pricing is now down 12.3% sequentially and 28.1% y/y following a 12.6% \nsequential decline in 4Q18. High capacity DDR4 4Gb 512Mx8 is down 11.6% sequentially and 40.2% y/y \nthus far in the quarter.  
\n \nDigiTimes Article Highlights SSD Demand Elasticity via Declining NAND Flash Pricing \nDigiTimes this week published an article highlighting the expectation of increasing enterprise SSD demand \nelasticity amid continued declines in NAND Flash pricing. The article notes that SSD providers are pushing \n16TB capacities (e.g., Silicon Motion, or SIMO, is pushing new enterprise SSD controllers supporting \n16TB of capacity; volume shipments commencing in 2H2019). In addition to pricing dynamics, we \ncontinue to believe the adoption of NVMe in the enterprise market opens up the system architectures to \nincorporate / adopt higher NAND Flash capacity density. The article (along with our industry checks) has \nincreasingly pointed to rising adoption of 512GB client SSD capacities over the past several months \nas price points have declined to levels of 256GB SSDs (note: industry estimates show ~260-280GB \naverage capacity for client SSDs in the most recent quarter). \n \nIn terms of NAND Flash technology, the article reports that NAND Flash vendors are now realizing 90%+ \nyield levels on 64-Layer 3D NAND and are increasingly focused on moving to 96-Layer 3D NAND. Western \nDigital has noted that it expects 96-Layer 3D NAND production cost crossover to be achieved in the \ncurrent quarter; bit production crossover targeted for C4Q19. Micron this week also highlighted \npositive traction in 96-Layer 3D NAND; however, the company also noted that it would not meaningfully \ncommercialize its next-generation 3D NAND (likely 128-Layer) given the cost structure increase attributed \nto the move to replacement (charge) trap technology vs. Micron’s use of floating gate thus far. 
\n \nDigiTimes, referencing DRAMeXchange, notes that NAND Flash contract prices continue to fall; however, \nthe rate of decline is moderating in 2Q19 versus the -20% q/q decline in 1Q19. Consistent with our recent \nmeeting with Western Digital management, the article notes that NAND price declines are expected to continue to \nmoderate into 2H2019 as vendors realize moderating output levels into mid/2H2019. \n \nDigiTimes Highlights Increasing PCIe (NVMe) SSD Adoption \nIn addition to the aforementioned thoughts on NAND Flash pricing and demand elasticity trends, DigiTimes \nalso published an article discussing the increasing adoption of PCIe (NVMe) SSDs. A few of the key points \ninclude: \n \n\uf0b7 Based on industry sources, global shipments of SSDs are expected to increase 20%-25% y/y in 2019. \n \n\uf0b7 PCIe (NVMe) SSDs are expected to account for ~50% of total shipments – including both client and \nenterprise. \n \n\uf0b7 512GB PCIe client SSD prices have declined 11% q/q to $55 in 1Q19, which compares to a 9% q/q \ndecline in SATA SSD prices – resulting in an ongoing narrowing of the price delta from the ~30% level \nseen in 2018. \n \n\uf0b7 Avg. unit prices for 512GB capacities have declined to levels comparable with 256GB SSDs a year ago; \nlarger price declines are expected in the 512GB to 1TB capacity points through 2019 – i.e., bit demand \nelasticity. \n \n\uf0b7 The industry is seeing accelerating adoption of PCIe Gen 3.0 SSDs in the notebook market. 
\n \n \nThe Next Platform Discussion on Hyperscale Datacenter Architecture Trends – Interview with \nMicrosoft GM of Azure, Kushagra Vaid \n \nArticle Link: https://www.nextplatform.com/2019/03/20/on-the-hot-seat-in-the-hyperscale-datacenter/ \n \nThe Next Platform published an interesting and insightful article following an interview with Microsoft \nDistinguished Engineer & General Manager of Microsoft Azure, Kushagra Vaid. The discussion touches on \nMicrosoft’s adoption of OCP based servers (Project Olympus), views on the evolution toward specialized / \npurpose-built silicon, thoughts on cooling as an increasingly significant challenge in hyperscale data \ncenters, and the need to move to silicon photonics. Below we summarize some of the interesting \ntakeaways: \n \n\uf0b7 Mr. Vaid noted that the huge majority (+90%) of the hardware Microsoft purchases is based on the \ncompany’s Open Compute specifications. \n \n\uf0b7 Four and eight-socket servers for hierarchical storage are not currently covered by the OCP \nspecifications. These systems require head nodes with Fibre Channel (FC) connectivity, as well as \ninterfaces with tape subsystems for data retention. \n \n\uf0b7 The article highlighted the broad adoption of configurations based on Microsoft’s Project Olympus \nreference specifications. \n \n\uf0b7 Microsoft places a Cerberus security chip on every motherboard, which provides scalability across a \nbroad-based supply chain. \n \n\uf0b7 All new server capacity being deployed in Azure is using Project Olympus servers; noting that it takes a \nwhile to decommission the servers the company has previously deployed from HP Enterprise and Dell. \n \n\uf0b7 Microsoft is collaborating with Facebook on the new OCP Accelerator Modules (OAM) as part of the \ncompany’s Zion server design. 
The article notes that this could allow Microsoft and Facebook to deploy \na chassis with 16 GPUs versus eight GPUs. \n \n\uf0b7 Cambrian Explosion: The Next Platform notes that we are in a Cambrian explosion of computing, while \nalso highlighting that Microsoft has an effort to have up to half of their computing done on Arm-based \nCPUs. In response, Mr. Vaid points out that Arm CPUs are multithreaded and work very well for certain \ntypes of workloads; x86 CPUs are better suited for single-threaded performance – the datacenter is \nbecoming more heterogeneous. \n \n\uf0b7 Mr. Vaid endorsed the view that we are moving to more purpose-built silicon requirements given the \nslowdown in Moore’s Law. He noted the view that if the economic value justifies the need for \nspecialization for certain workloads then the industry (and Microsoft) will focus on this architectural \nneed. \n \n\uf0b7 When asked about the positioning of startups focusing on specialized silicon development, Mr. Vaid said it is \nstill too early to tell how it plays out – none of them are in volume production yet. \n \n\uf0b7 The article notes that power consumption continues to be an increasingly important factor – CPUs now \nrequire +200 Watts of power. Power density increases also coincide with / drive increasing \nneeds for cooling – no one can move enough air to cool their infrastructure. Mr. Vaid noted that cooling \nis going to be a big issue to overcome over the next 2-3 years. \n \n\uf0b7 Alternative cooling techniques could include immersive liquid cooling, or heat / cold plates – noting that \nthe latter solution could be more attractive as it would not require radical changes in the data center. \nThis is going to be a necessity given the deployment of AI silicon across the data center. 
We think this is \nwhy  most  of  the  next-generation  silicon  providers  have  emphasized  competitive  benchmarks  around \nTOPS per Watt (TOPS/W).   \n \n\uf0b7  Cooling challenges have resulted in racks within a datacenter to be underutilized – i.e., unable to fill to \nfull  capacity  given  that  open  space  is  required  for  air  based  cooling.  The  article  also  notes  that  this \nresults in stranded / unused ports for top-of-rack switches.   \n \n\uf0b7  Mr. Vaid notes: “Today, the sweet spot is between 10 kilowatts and 15 kilowatts, based on commodity \nparts with air cooling. You can probably go up to 25 kilowatts or so, and sort of be alright, but now you \nneed more  copper  and  that  cost  starts going  up.  Beyond  that,  I  don’t  think  the  industry  has  solutions \nthat are available at a broader level  – excepting supercomputers and other exotic equipment.”   In our \nopinion, this is an interesting comment as it relates to Cray, which architecturally provided high-density \nsupercomputers and has Microsoft as a customer.  \n \n\uf0b7  From a networking perspective, Mr. Vaid highlighted the industry need to move to optical versus copper, \nincluding silicon photonics / integrated optics as the industry moves beyond 100G.   \n \n \n4Q18 Data Center Ethernet Switch Market Review: ODM / Direct Gain Revenue Share in 100Gb \nas $/Port Declines Moderate; Strong Cisco Campus Growth at +24% Y/Y \nWhile backward looking, this week we received IDC’s 4Q18 updated data center Ethernet switching data. \nIn addition to Ethernet shipments by market vertical, below we illustrate IDC’s Ethernet data center ports \nshipped and revenue market share estimates by speed for 4Q18. \n \n\uf0b7  Data Center Switching Vertical Breakdown: IDC provides a breakdown of port shipments / revenue \nby vertical – continuing to highlight the proliferation of cloud data center.  
\n \n \n \n\uf0d8 Cloud  Data  Center:  According  to  IDC  data,  cloud  data  center  switching  revenue  reached  $1.57 \nbillion in 4Q18 (+18.5% y/y) and was +9.5% sequentially. As illustrated below, Arista held a 22.2% \nmarket share in 4Q18, up from 19.6% a year ago, while Cisco held a 37.2% share (vs. 39.9% a year \nago).  Juniper’s  cloud  data  center  market  share  stood  at  5.7%  during  the  December  quarter, \ncompared  to  6.1%  and  7.1%  in  the  prior  and  year  ago  quarters,  respectively.  We  would  note \ncollective ODM Direct / whitebox revenue recorded strong growth of 39.3% y/y in 4Q18 (vs. +21.3% \ny/y  in  3Q18)  –  revenue  share  at 17.6% vs. 16.1%  in  3Q18.  Accton  /  EdgeCore continues  to be  the \nleading ODM at $168M of revenue in 4Q18 (+52% y/y).  \n\uf0d8 Communication Service  Provider:  Data  Center SP revenue  of $321  million  in  4Q18 was  up 2.7% \ny/y. This follows +7%, +10%, +4%, and 3% y/y in 4Q17, 1Q18, 2Q18, and 3Q18, respectively. Cisco \nheld  a  70%  share  in  the  December  quarter,  down  from  73%  in  the  prior  and  year  ago  quarters. \nArista’s  share  remained  unchanged  q/q  at  7.8%  and  compared  to  5.2%  a  year  ago.  Arista’s  SP \nrevenue  has  increased  78%  y/y  over  the  trailing  12  months  –  growing  from  $50M  to  $89M.  While \nmore  volatile  on  a  quarterly  basis,  we  would  also  highlight  Huawei’s  share  now  standing  at  15%  in \n4Q18, compared to 11.5% and 15% in the prior and year ago quarters.  \n \n\uf0d8 Enterprise: Enterprise data center switching revenue of $1.49 billion was down 1.4% y/y, compared \nto  +6.5%,  +12%,  +3.5%,  and  3.1%  y/y  in  4Q17,  1Q18,  2Q18,  and  3Q18,  respectively.  Cisco’s \nmarket leading share declined to 53.4% in 4Q18, compared to 55.1% in the September 2018 quarter \n(vs. 56% a year ago). Arista’s share declined sequentially to 8.4% in 4Q18, which compares to 9.1% \nand  8.4%  in  the  prior  and  year  ago  quarters.  
HPE/H3C’s  share  increased  to  10.7%,  compared  to \n10.1%  and  8.9%  in  the  prior  and  year  ago  periods,  while  Huawei  is  estimated  at  11.8%  share,  vs. \n9.1% and 11.8% in the prior and year ago periods, respectively. \n\uf0b7  100G  Data  Center  Revenue  +3%  q/q  vs.  +20%  in  3Q18;  ODM  /  Direct  Gain  Share;  Cisco  / \nArista  Battle  for  #1:  IDC  reported  approximately  3.8  million  100Gb  ports  were  shipped  in  4Q18 \n(+136% y/y); revenue of $1.08 billion was up 68% y/y and 3% sequentially (vs. +20% q/q in 3Q18). \nArista  recaptured  100Gb  revenue  share  in  the  quarter  from  Cisco,  although  both  companies  remain \nwithin  a  percent  of  each  other.  Arista’s  share  now  stands  at  31.6%,  compared  to  a  record  36.5%  in \n1Q18  and  30.6%  in  the  prior  quarter;  with  port  shipment  share  roughly  flat  sequentially  at  24.1%, \ncompared  to  23.6%  in  the  year  ago period. Cisco’s  revenue  share  stood  at  31.4%, down  from 33.2% \nand 37% in the prior and year ago periods; port share at 15.6% vs. 18.6% in 3Q18 and 20.9% a year \nago. Juniper’s port ship share stood at  3.3%, compared to  3.7% a year ago (2.6% in 3Q18); revenue \nshare was down for the third quarter in a row  at 3.7%, and compared to 5.6% a year ago. We would \nnote that HPE’s China joint venture (H3C) lost revenue share in the December quarter and now stands \nat 1.2% (down from 4.5% last quarter), while Huawei increased its share to 7.3% vs. 4.8% in 3Q18. \n \nODM / Direct port ship market share continues to climb – now standing at 48.6% in 4Q18, compared to \n42.7% a year ago. ODM / Direct revenue share rose to an all-time high of 18.3%, which compares to \n15% a year ago. Within this, we would note that Accton / EdgeCore held a 13% and 35% revenue and \nports  shipped  market  share.  
We think it is important to consider the significant discount on a $/port \nbasis offered by ODM / Direct – standing at approximately $108 in 4Q18 (vs. $142 a year ago), which \ncompares to Cisco and Arista at $577 and $377, respectively. We would highlight that Cisco’s $/Port \nincreased 7% sequentially, while Arista’s was relatively flat and Juniper saw a 30% q/q decline. ODM / \nDirect $/Port saw declines moderate in the quarter at -2% q/q vs. -7% q/q in both 2Q18 and 3Q18. \n \n\uf0b7 Cisco Campus Switching Revenue +23% Y/Y Driven by Catalyst 9000 Adoption: In addition to \nour analysis of the data center switching market, we would highlight IDC data indicates Cisco’s campus \nswitching business increased 23% y/y in 4Q18, compared to -4%, +6%, -5%, and +8% y/y in 4Q17, \n1Q18, 2Q18, and 3Q18 respectively. This compares to total campus switching revenue growing 17% y/y \nin 4Q18 (vs. +6% and -4% in the prior and year ago quarters, respectively) – Cisco’s revenue share \nstands at 58%, relatively flat from the prior quarter (vs. 55% a year ago). We continue to focus on \nCisco’s Catalyst 9000 campus product refresh (introduced in July 2017) as the company reported double \ndigit growth in campus switching driven by the Cat 9k product cycle in F2Q19. We would also highlight \nHPE’s revenue share gains in campus switching during the quarter – growing to 9.2% vs. 8.7% share in \n3Q18. 
\n \nIDC External Storage Forecast Continues to Call for -1% 2019; AFA +11% CAGR Thru 2023 \nLast week IDC published its updated (1Q19) storage forecast – increasing the firm’s 2019 revenue \nestimate for external storage by an additional 4% following a 3% increase in December. However, we \nwould attribute this to higher than previously forecasted 2018 spending – IDC continues to expect 2019 \nspending to decline 1% y/y (vs. +16% y/y in 2018). IDC increased its storage revenue forecast by an \naverage of 5% across the forecast period, now estimating external storage to grow at a 1% CAGR 2018-2023. \nOther highlights of this include: \n \n \n\uf0b7 IDC raised its all-flash storage forecast by 1%, following a 3% increase in the December 2018 forecasts; \nIDC increased its estimates across the forecast period by an average of 1% (vs. +4% average in the \nDecember forecast). The firm’s current estimate calls for all-flash storage revenue to grow by 17% y/y in \n2019 (vs. +45% y/y in 2018), down slightly from the prior forecast of +18% y/y. IDC estimates all-flash \nrevenue growing at an 11% CAGR 2018-2023. \n \n\uf0b7 The firm increased its hybrid storage forecast by 9% since its December forecast, estimating -5% y/y in \n2019 (vs. prior estimate -10% y/y). 
IDC increased its hybrid storage estimates on average by 16% \nacross the forecast period; however, IDC continues to expect hybrid storage will decline at a 2% CAGR \n2018-2023. The firm estimates hybrid storage will account for 35% of total storage in 2023, down from \n40% in 2018. \n \n\uf0b7 IDC’s disk-only storage estimate for 2019 increased by 2%, estimating a 19% y/y decline (vs. -11% y/y \nin 2018). However, IDC did increase its all-disk storage forecast by an average of 13% across the \nforecast period – estimating revenue to decline at an 11% CAGR 2017-2022 compared to the prior \nforecast of a 15% decline. \n \n\uf0b7 IDC slightly revised upward its external storage capacity shipped forecast – now estimating 73.8PBs of \ncapacity in 2019, up 10% y/y (vs. +21% y/y in 2018). IDC increased its capacity shipped forecast by an \naverage of 1.3% across the forecast period and at a similar rate as the increase seen in December. The \nfirm estimates total external storage capacity shipped to grow at a 15% CAGR 2018-2023. \n \nResearchers Test Optane Memory Performance; Positive Results vs. Conventional SSDs \nThe Next Platform published an article outlining a study by a team of researchers at the University of California at \nSan Diego which measured the performance of Optane DC memory against conventional memory using a \nvariety of memory and I/O benchmarks. Researchers operated the Optane as both main memory and as an \nin-memory file store (Optane is able to save contents when power is switched off). The Optane performed \nwell as main memory, with only an 8.6% throughput decrease for the Memcached benchmark and a 19.2% \ndecrease for the Redis benchmark when used as memory with the DRAM as a cache. 
The researchers reported that \nwhen caching was disabled, thus not allowing DRAM to compensate for the comparatively slow Optane, \nperformance decreased by 20.1% and 23%, respectively. The study supported Optane’s \nperformance over conventional SSDs and highlighted its potential for large memory footprints currently \nsupplied by DRAM. The Next Platform’s Michael Feldman was more cautious on Optane DC’s potential for \nin-memory store device replacements, noting that the researchers’ results showed a sensitivity to application \ntype and degrees of software optimization. \n \nOptane vs. DRAM Benchmarks \nSource: University of California at San Diego \n \n \nOptane vs. Conventional and Optane SSDs \n \nSource: University of California at San Diego \n \nDigiTimes Highlights TSMC Momentum from Chinese IC Market \nDigiTimes reported this week that TSMC is seeing increased momentum in foundry demand driven by AI, \nIoT, and 5G chips from China’s IC design sector and global technology companies. The article notes that \nTSMC’s 7nm is driving demand as China’s domestic efforts are still having difficulty matching TSMC’s \n16nm process technology. \n \nA separate article highlighted that TSMC’s new 8-inch wafer fab built in Tainan will have its capacity \nmostly filled as it is seeing robust orders for automotive chips from STMicroelectronics and other dedicated \nchipmakers. Automotive chip foundry is expected to become a major growth driver for TSMC after 2020 \nas it is seeing many customers, including NVIDIA and Qualcomm, push into the automotive segment. It \nwas noted that Taiwan’s Vanguard International Semiconductor, United Microelectronics (UMC), and \nChina’s SMIC have also announced plans to expand their 8-inch foundry capacities. 
\n \nSamsung Pulls Forward Pyeongtaek Fab Expansion by 3 Months? Targeting Production in March \n2020 \nIt was reported by The Investor that Samsung will likely look to commence operations of its second NAND \nFlash fab in Pyeongtaek in March 2020, which was originally planned for June 2020. The article notes that \nSamsung is preparing this new fab for production with expectations of a recovery in demand in 2020. It is \nexpected  that  Samsung  will  build  two  additional  fabs  within  this  complex,  in  which  the  company  will \nannounce plans for the next construction soon.  \n \nFifteen  New 300mm Semi Fabs to  Open  in 2019  and 2020; 138 300mm Fabs  in  Operations  by \n2023 vs. 112 Exiting 2018 \nDigiTimes this week highlighted the expectation of nine new 300mm semiconductor fabs to open in 2019 – \nreferencing research from IC Insights.  Of the nine 300mm (12-inch) fabs expected to open in 2019, five \nare  located  in  China.    The  article  also points out  that  nine  new fabs would  represent  the  most  new  fabs \nopen since 2017 with twelve new opens. There are currently six new fabs expected to open in 2020.   IC \nInsights  estimates  that  there  were  112  production-class  300mm  fabs  globally  as  of  the  end  of  2018, \nexcluding R&D fabs  and  non-IC generating  operations.   IC  Insights  estimates  that  there  will  be 26  more \nfabs in operations by 2023 when compared to the number of fabs exiting 2018.    \n \nDARPA Launches Machine Learning Processor Initiative with the National Science Foundation \nThe  Defense  Advanced  Research  Projects  Agency  (DARPA)  and  the  National  Science  Foundation  jointly \nannounced  a  project  to  develop  “foundational  breakthroughs”  in  machine  learning  hardware  in  the \nagency’s  latest  push  to  develop  next-generation  semiconductor  hardware.  
The project will first seek to \ndevelop a machine learning hardware compiler used to create advanced ML algorithms and networks \nbased on current ML programming frameworks. A second phase of the project will focus on hardware \noptimization. DARPA plans to release a version of the planned real-time processor in 1H21 with planned \nworking silicon at the end of the initiative’s 3 year life. \n \nTencent Gaming Revenue Growth Stalls as China’s Video Game Crackdown Slows Title Releases \nTencent reported weakness in its smartphone and PC gaming businesses last week as the Chinese \ngovernment’s crackdown on video games slowed game releases. As we have previously noted, we think \nthat China’s crackdown on video games is a potentially underappreciated risk to Apple’s services \nmomentum with a significant portion of the company’s App Store revenue coming from APAC gaming. As \na reminder, IDC recently forecasted direct game spending and in-game ad revenue on smartphones and \ntablets will grow 12.6% y/y in 2019 vs. +14.9% y/y in 2018 with a slowdown in China as a significant \ncontributor. \n \n\uf0b7 Smartphone Gaming: Revenue of ~19B RMB ($2.84B), down 2% sequentially but up 12% y/y. The \ncompany noted that it has received 7 game approvals from the government since it resumed approvals \nin December. The company expects the pace of its game releases to slow as the government faces a \nlarge approval backlog. \n \n\uf0b7 PC Client Games: Revenue of 11.2B RMB ($1.67B), down 10% sequentially and 13% y/y. The \ncompany noted that its PC gaming business was affected by consumers moving to mobile as well as \nseasonality. The company positively mentioned the potential for cloud gaming; noting that it would \nallow low-end PC and TV screen users the ability to play high-end games. 
Management said that China \nhas  a  unique  market  in  that  it  has  strong  bandwidth  infrastructure  but  a  low-end  PC  and  smartphone \nmix compared to developed countries. The company envisions support for single-player games followed \nby multi-player games as latency concerns are alleviated.  \n \n16 | Wells Fargo Securities, LLC \n \nWells Fargo Technology Weekly \n \nEquity Research \n \n \nIDC Forecasts Wearable Device Shipments to Grow at 8.9% CAGR Through 2023 \nIDC  released  its  worldwide  market  forecast  for  the  wearable  device  market,  predicting  that  the  total \nmarket inclusive of watches, earwear, wristbands, and clothing will grow 15.3% in 2019 to 198.5M units. \nIDC forecasts an 8.9% CAGR for wearable devices through 2023 with total shipments hitting 279M driven \nby growth in earwear and watches. Key data from the report is listed below: \n \n\uf0b7  Watches:  Expected  to  hit  90.6M  units  in  2019  and  to  grow  at  a  9.7%  CAGR  through  2023.  Watches \nrunning  WatchOS  (Apple),  WearOS,  and  other  forked  versions  of  Android  will  account  for  ~28%  of  all \nwatches shipped in 2023.  \n \n\uf0b7  Earwear:  Projected  to  ship  54.4M  units  in  2019  and  grow  at  a  12.3%  CAGR  through  2023.  Earwear \ngrowth will be driven by the inclusion of biometric sensors and the adoption of smart assistants. \n \n\uf0b7  Wristband:  Expected to ship 49M units in 2019 but experience flat growth through 2023 with a 0.7% \nCAGR. Revenue value is expected to decline at a 4.1% CAGR with ASPs decreasing from $51 in 2019 to \n$42 in 2023.  \n \n\uf0b7  Clothing:  The  smallest  segment  at  3M  units  shipped  in  2019  but  expected  to  grow  at  a  30.2%  CAGR \nthrough  2023.  Connected  clothing  is  primarily  composed  of  step-counting  shoes  which  are  popular  in \nChina but have seen some interest at large American companies like Nike and Under Armour.  
\n \n                      Source: IDC \n \n \n \nWells Fargo Securities, LLC | 17 \nIT Hardware & Communications Networking \nEquity Research \nEpsilon Theory Considers the Differing Media Narratives on AI in the U.S. & China \nEpsilon Theory published an interesting piece last week on the differing media narratives around artificial \nintelligence  in  China  and  the United  States.  Epsilon  Theory  uses proprietary software  to  deconstruct  and \nmap media narratives in clusters – the charts below show the detailed breakdown of over 2,000 AI articles \nin  the  two  countries.  As  the  analyzed  Chinese  media  is  English-language  (largely  state  driven)  we  think \nthat  it  could  provide  some  insight  into  where  the  Chinese  government  is  going  with  AI.  Some  notable \nnarrative clusters in the China graph include the country’s struggles around getting foreign capital to fund \nAI startups, Made in China 2025, and the idea that the U.S. is outpacing China in AI. The article also notes \nthat the Chinese media around AI uses similar language, suggesting a coordinated effort to push AI. We \nthink  that  China  will  continue  to  invest  heavily  in  AI  hardware  startups  such  as  Horizon  Robotics  and \nCambricon  that  we  recently  highlighted  in  our  AI  Semiconductor  whitepaper  (see  our  note:  Cutting  the \nVon Neumann Knot). 
\n \nChinese English-Language Media AI Narrative \nSource: Epsilon Theory; Quid \nAmerican Media AI Narrative \n \nSource: Epsilon Theory; Quid \n \nInk World Magazine Forecasts Raw Material Price Increase and Higher Regulatory Pressure for \nPigment Manufacturers in 2019 \nInk World Magazine released a detailed article highlighting a variety of difficulties that pigment \nmanufacturers will face in 2019, including decreased availability and increased prices of raw materials and \nhigher regulatory pressure, especially in China. The report included several industry demand forecasts \nincluding The Freedonia Group, which predicts that global demand for dyes and organic pigments will grow \n6% in 2019 to $19.5B, Grand View Research, predicting a 5.8% CAGR from 2017-2025, and a report from \nMarketsandMarkets that estimated a global CAGR for dyes and pigments of 5% with a total market value \nof $42B in 2021. All of the reports stressed the importance of titanium dioxide (TiO2), which accounts for \n~60% of total pigment demand. Other key points from the article include: \n \n\uf0b7 Continued Regulatory Pressure in China: As we have previously written, China has cracked down on \nits domestic ink industry as it works to reduce the industry’s environmental impact in the country. The \nregulatory oversight has resulted in several factory closures and tighter supply. The country’s Green \nGDP initiative to improve water and air quality has contributed to the unavailability of several critical \nraw materials. \n \n\uf0b7 Raw Material Pricing: Suppliers have noted the continued increase in the prices of critical raw \nmaterials, especially those sourced from China. Shortages of feedstock in China have helped add to the \nissue. Prices for colored pigment have been especially affected. 
Carbon black prices have been driven up \ndue to a weakening of China’s steel industry, coking demand, and limited crude tar availability, \naccording to Dr. Sanjay Monie, a marketing manager at Orion Engineered Carbons. \n \n\uf0b7 Difficulties in Forecasting Just in Time Inventory: Pigment manufacturers have had difficulty \nkeeping up with just-in-time inventory requests and have struggled to accurately forecast demand. \n \nCoverage / Valuation Summary: \n \nNoteworthy Company-Specific News/Thoughts Review: \n \n \nApple \n \n \n(AAPL – $192.99 – Market Perform) \n \n \nApple Introduces New AirPods & iPad Air – Refreshes iPad Mini and iMac \nLeading up to Apple’s March 25th Media Event that is widely anticipated to focus on services, Apple \nintroduced new hardware, including second generation AirPods, an all-new iPad Air, as well as refreshes of \nthe existing iPad Mini and iMac products. Below we provide a summary of the new announcements. \n \n\uf0b7 iPad Air: Apple introduced the latest addition to its iPad product portfolio - the 10.5 inch iPad Air. \nThe Air is positioned between the iPad Pro and the standard 9.7 inch iPad; compared to the standard iPad, it offers a \n70% boost in performance, twice the graphics capability, and a 20% larger Retina display with 500k+ \nmore pixels (2224 x 1668 resolution; 264 ppi). It features Apple’s A12 Bionic chip (7nm), Neural \nEngine, and M12 coprocessor, as well as a 7MP FaceTime camera and an 8MP rear camera. 
The device \nis compatible with the first generation Apple Pencil, but not the 2nd generation Pencil that is magnetically \nattached and supports wireless charging (only for iPad Pro). The new iPad Air starts at $499 for the Wi-Fi \nmodels and is available with 64GB or 256GB of storage capacity. \n \n\uf0b7 AirPods: The recently introduced second generation AirPods feature Apple’s new H1 chip, which is \nspecifically designed for headphones and enables faster connect times, 50% more talk time (extra \nhour), and the optional “Hey Siri” feature. Notably, the new AirPods ship with a wireless charging \nenabled battery case that works with Qi-compatible charging solutions. The new AirPods with the \nwireless charging case will be available for $199, and existing AirPods users can purchase a standalone \nwireless charging case for $79. Apple will continue to sell the AirPods with the standard charging case \nfor $159 (unchanged). \n \n\uf0b7 iPad Mini: Apple also refreshed the 7.9-inch iPad Mini, which now features the A12 Bionic Chip as well \nas first generation Apple Pencil support. The new Mini delivers 3x the performance and 9x faster graphics \ncompared to the prior generation. The display is also 25% brighter and features the highest pixel \ndensity of any iPad (326 ppi; 2048 x 1536 resolution). The iPad Mini starts at $399 for the Wi-Fi models \nand is available with 64GB or 256GB of storage capacity. \n \n\uf0b7 iMac: Apple’s updated iMac line now features up to an 8-core Intel 9th generation processor and Radeon \nVega graphics options from AMD. The 21.5-inch model features 8th generation quad-core, and for the \nfirst time 6-core processors, which deliver up to 60% faster performance compared to the prior model \nusing the high spec 3.2GHz 6-core i7 model (turbo boost up to 4.6GHz). 
The 27-inch model \nfeatures Intel’s latest 9th generation 6-core or 8-core processors, which deliver 2.4x faster performance \ncompared to the prior model using the top of the line 3.6GHz 8-core i9 model (turbo boost up to 5GHz). \nOn the graphics front, the 21.5-inch iMacs can now be configured with a Radeon Pro Vega 20 graphics \ncard with 4GB of VRAM (up to 80% faster than prior model), while the 27-inch models can be \nconfigured with a Vega 48 with 8GB of VRAM (up to 50% faster than prior model). The new 21.5-inch \niMac starts at $1,299 and the new 27-inch starts at $1,799. \n \nIn addition to the above announcements, we would note that our tracking found Apple has quietly \nupdated the iMac Pro with expanded high-end DRAM and AMD graphics card options. The top of the line \niMac Pro now features up to 256GB of RAM (2,666 MHz DDR4 ECC; $5,200 upgrade) and up to a \nRadeon Pro Vega 64X graphics card with 16GB of HBM2 memory ($700 upgrade). Additionally, Apple \nrecently lowered the prices of some existing memory and storage upgrade options across its Mac line-up \nby up to $400 amid the ongoing price declines in both DRAM and NAND. \n \nXiaomi’s Smartphone Revenue Grows 41% Y/Y; Reports Strength in Chinese Market \nXiaomi reported strong smartphone 4Q18 results with total revenue of 174.9B RMB, +52.6% y/y, with \ninternational revenue of 70B RMB, up 118% y/y. Xiaomi’s total smartphone shipments were 118.7M for \nthe fiscal year, +29.8% y/y compared to a 4.1% y/y for global smartphones. Notably, Xiaomi increased its \nASPs in China by 17% y/y in an otherwise soft China market while its international ASPs were up 10% \ny/y. According to 4Q18 IDC data, Xiaomi had a 9.8% ship share in Greater China vs. a 13.5% ship share \nin the year ago period. 
The company has continued to see strength in India, with a 28.9% ship share \nwhich compares to the #2 vendor Samsung at an 18.7% share. We continue to see the strength of Chinese \nvendors as a threat to Apple in the Chinese mid-high range smartphone market. \n \nNVIDIA \n(NVDA – $179.10 – Outperform) \n \nAWS Offering New G4 Compute Instances using NVIDIA Turing-based T4 GPUs for Inferencing \nIn conjunction with NVIDIA’s GPU Technology Conference (GTC) held this week, Amazon Web Services \n(AWS) announced the availability of the company’s new G4 Instances running in the Elastic Compute \nCloud (EC2) based on NVIDIA’s Turing T4 GPUs for AI inferencing workloads. The NVIDIA Tesla T4 GPUs \nrepresent NVIDIA’s efforts to penetrate what has thus far largely been x86 based AI inferencing \ndeployments; NVIDIA has a dominant position for GPUs in AI training. The T4 GPUs include 2,560 \nCUDA cores and 320 Tensor cores to provide up to 8.1 teraflops of single precision performance (up to \n130 and 260 TOPS with INT8 and INT4, respectively). The G4 instances will use custom Intel CPUs (4 \nto 6 vCPUs) and up to eight T4 GPUs, along with 384GiB of DRAM memory and 1.8TB of NVMe-based Flash \nstorage. AWS’ deployment of the T4 GPUs follows Google’s use of the GPUs currently in beta access. \n \nMellanox & NVIDIA Highlight Record Performance for HDR 200G InfiniBand + SHARP \nTechnology \nEarlier this week Mellanox and NVIDIA announced that Mellanox’s HDR 200G InfiniBand with the Scalable \nHierarchical Aggregation and Reduction Protocol (SHARP) technology with NVIDIA V100 Tensor Core GPUs \nhas set new performance records, doubling deep learning operations performance. 
The companies noted \nthat this test included the Mellanox HDR InfiniBand Quantum connecting four system hosts, each including \n8 NVIDIA V100 Tensor Core GPUs with NVLink interconnect technology, and a single ConnectX-6 HDR \nadapter per host – achieving an effective reduction bandwidth of 19.6GB/s by integrating SHARP’s native \nstreaming aggregation capability with NVIDIA’s latest NCCL 2.4 library. It was highlighted that in the most \ncommon setup for this configuration, four HCAs in each system host are used for balanced performance \nacross a variety of workloads, where the initial SHARP and NCCL results yielded an expected 70.3GB/s. \n \nPositive Investor Day - Gaming Inventory Burn, GFN Alliance, & More \n• WELLS’ CALL – POSITIVE REVIEW OF NVIDIA’S PLATFORM-DRIVEN FOCUS / \nDIFFERENTIATION; NO ESTIMATE CHANGES: NVIDIA hosted a well-attended Investor Day held in \nconjunction with NVIDIA’s GPU Technology Conference (GTC). While we are not changing our forward \nestimates (note: NVIDIA’s Investor Days have not historically provided any forward-looking financial \nmodel targets / commentary), and we can appreciate the fact that NVIDIA shares are not cheap (26x \nP/E and 21x EV/EBITDA on our C2020 estimates), we think the company provided what should be \nviewed as a positive review of its domain-focused platform positioning and improved fundamental \ndrivers looking forward. A few of the most notable takeaways, in our opinion, include: (1) Positive \nRTX upgrade cycle commentary – With a reiteration that it expects a return to normalized gaming \ninventory exiting F1Q20, coupled with the Turing portfolio availability (entry to high-end), NVIDIA \npositively disclosed that Turing RTX has seen a 45% higher sell-thru vs. 
Pascal at this point in product \ncycle (~8 weeks sell-thru at $299+ price points); ~48% of installed base at pre-Pascal GPUs. (2) \nGeForce NOW Alliance = NVIDIA Revenue Sharing Opportunity for Game Streaming (5G Play) \n– NVIDIA GeForce NOW Alliance is a cloud-based game streaming partnership opportunity with service \nproviders and an NVIDIA revenue sharing model (a 5G play). NVIDIA announced initial partnerships with \nSoftBank and LG U+; service availability expected in 2H2019. The company believes it can have at \nleast one partner in each country. (3) Datacenter – no big surprises / model update drivers, in our \nopinion; NVIDIA outlined a datacenter TAM expanding from ~$37B in 2018 to an estimated $50B by 2023. \nThe company also emphasized its datacenter CUDA ecosystem expansion (1.2M developers, 600+ \naccelerated apps, etc.). \n \n• DATA CENTER: $50B TAM by 2023. NVIDIA provided a positive overview of the company’s data \ncenter opportunity with the company estimating an addressable opportunity of ~$50 billion by 2023. \nNVIDIA pointed to continued strong growth in hyperscale AI training workload requirements – continued \nupward trend in petaflops / day with a continued expansion in datasets. More importantly, NVIDIA \nbelieves it is at the early stages of accelerating adoption for AI Inferencing (a continued key topic of \ndebate among investors, in our opinion), incrementally reporting that in F2019 the company had a few \nhundred million dollars of AI inferencing revenue. In enterprise, NVIDIA disclosed that it has seen 3.5x \nDGX revenue growth y/y for deep learning; 1,000+ DGX customer deployments. \n \n• GAMING: NVIDIA highlighted that its Turing gaming GPUs are off to a positive start (sell-thru of $299+ \nTuring GPUs is 45% higher than Pascal ~8 wks. 
post launch) and reiterated its confidence in Ray \nTracing with roughly a dozen games supporting RTX expected in 2019; focus on acceleration driven \nby expanding support from numerous game engines (notably Unreal & Unity). The company remains \nconfident in secular gaming trends, including eSports momentum (moving toward 500M+ gamers; up \nfrom 280M in 2016 and ~400M in 2018), and growth in casual gamers who start at a younger age and \ncontinue gaming longer into their lives. NVIDIA noted that the crypto inventory flush is on track and \ncontinues to expect normalization exiting F1Q20. \n \nPlease see our detailed report published on 3/18/19 for additional information. \n \nJensen Huang (CEO) Keynote @ GTC - Quick Notes / Thoughts \n• WELLS’ CALL: We attended the keynote presentation by Jensen Huang (NVIDIA Founder & CEO) at GTC. \nMr. Huang, as expected, positively highlighted NVIDIA’s competitive differentiation / ecosystem \nexpansion (i.e., it’s more than just chip development). With a positive view on NVIDIA’s acquisition of \nMellanox (deepening datacenter strategy / synergies), we increase our price target to $190 (was $170). \nIn terms of announcements, we would highlight: (1) New Data Science Server; No Updates to \nDGX-2: NVIDIA introduced a new Data Science Server positioned toward hyperscale (scale-out) \narchitectural approaches for data science. The Data Science Servers integrate four T4 GPUs in a server \nw/ 64GB of GDDR6 memory and Mellanox or Broadcom Ethernet; 260 teraflops of performance (FP16), \n(2) Omniverse Intro: Open collaboration platform for global animation studio workflow, (3) GeForce \nNOW Alliance: GeForce NOW Alliance is NVIDIA’s collaboration initiative working with telecom / other \nservice providers for on-line game streaming over 5G networks; announced initial partnerships with \nSoftBank and LG U+. 
Below we briefly summarize what we think are the most interesting / incremental \ntakeaways from the keynote presentation: \n \n• NVIDIA’s Expanding Datacenter Systems Strategy w/ New Data Science Servers (4 x Turing \nT4 GPUs): Given investor focus on NVIDIA’s $6.9B acquisition of Mellanox, we think the company’s \ndatacenter-focused announcements are the most notable focus. Mr. Huang provided a simplified \ncomparison between supercomputing and hyperscale computing, coupled with where the company sees \ndata science residing between these scale-up vs. scale-out architectures. NVIDIA pointed out that the \ncompany’s DGX-2 systems are positioned toward supercomputing, while the company’s new Data \nScience Servers are positioned for scale-out architectures toward the hyperscale market. The Data \nScience Servers include 4 x Turing T4 GPUs with 64GB of GDDR6 memory, connected via Mellanox or \nBroadcom Ethernet NICs (Wells’ Note: we will be interested in NVIDIA’s leverage of Mellanox’s new \nBlueField-based SmartNICs) and resulting in 260 teraflops of performance (FP16). The new Data \nScience Servers will be offered as an NGC-Ready validated solution by Cisco, Dell EMC, Fujitsu, HPE, \nInspur, Lenovo, and Sugon. As a reminder, the DGX-2 servers integrate 16 x Tesla V100 GPUs with \n512GB of HBM2 memory and 8 x Mellanox InfiniBand switches. NVIDIA also emphasized the importance \nof RAPIDS – NVIDIA introduced RAPIDS in October 2018 as an open-source software library for AI. \nNVIDIA highlighted RAPIDS support at Microsoft Azure, Google Cloud, and Databricks, as well as \npartnership alignment with Accenture Digital. 
\n \n• NVIDIA Ecosystem Expansion: NVIDIA reiterated its platform \nexpansion by noting that it now has 1.2M CUDA developers, up from ~440k and ~770k reported in \nMarch 2017 and 2018, respectively. The company also highlighted that CUDA now supports 600+ \naccelerated applications, up from ~400 and 550 accelerated applications reported in March 2017 and \n2018, respectively. NVIDIA also disclosed that it has had 13M+ CUDA downloads, an increase from 8M \ndownloads last year. \n \n• Gaming: NVIDIA reported that it saw ~50% y/y growth in its notebook GPU revenue in 2018; 40 new \nnotebooks using Turing GPUs will become available in 2019. In gaming, NVIDIA also disclosed that \nTuring RTX would be available with Unity game engine starting on April 4th, as well as support with \nMicrosoft DirectX (application programming interface for game development on Windows and Microsoft \nXbox), Unreal Engine (game engine developed by Epic Games), and Vulkan (rendering engine that \nprovides software abstractions of the GPUs). NVIDIA reported that Turing RTX comes with 9 million 3D \ncreators in 2019 (1 million architects, 3 million designers, 3 million 3D artists, and 2 million M&E \nprofessionals). Over 80% of the world’s leading graphics tool makers are working with Turing RTX (100% \nexpected by the end of 2019). \n \n• NVIDIA Intros Omniverse: NVIDIA introduced Omniverse – an open collaboration platform positioned \nto simplify studio workflows for real-time graphics. NVIDIA highlighted this as becoming increasingly \nimportant as there are over 200 animation studios globally that are increasingly collaborating on end \nanimations. Omniverse runs on a local workstation, in a datacenter, or in the public cloud. Omniverse \nis currently in early access availability. 
\n \n• NVIDIA Intros GeForce NOW Alliance – RTX Server Deployments in Telecom SPs for 5G-Based \nGame Streaming (Mellanox Acquisition Increasingly Important). GeForce NOW is a cloud-based \nopen gaming platform; a GeForce PC in the cloud. NVIDIA announced that GeForce NOW \ncurrently includes 500+ games supported across 15 datacenters and now includes 300,000 players (1 \nmillion gamers are currently on the waiting list). Given the latency attributes of game streaming and the \nneed for datacenter expansion, NVIDIA announced that it has created the GeForce NOW Alliance - \nNVIDIA working with telecom providers that are interested in enabling game streaming over their 5G \nnetworks. The first two partners announced include SoftBank in Japan and LG U+. NVIDIA RTX servers \nwill be hosted in these data centers – RTX Servers including support for up to 40 Turing GPUs in an 8U \nplatform; optimized for end-to-end stack for rendering, remote workstation, and cloud gaming. NVIDIA \nis also supporting a Pod deployment – 32 RTX servers, with 1,280 RTX GPUs (Wells’ thought: we think \nthis highlights the importance of Mellanox’s low-latency InfiniBand and Ethernet connectivity \ncapabilities). NVIDIA noted that a single Pod deployment can support up to 10,000 concurrent players. \n \n• NVIDIA RTX Server vs. Intel Dual Skylake-based Server: NVIDIA highlighted a rendering \nperformance / cost comparison of an NVIDIA RTX server versus a dual-Skylake based server – showing a \nsingle node deployment vs. a required 25 node Intel-based deployment; power consumption at ~$10k \nvs. ~$70k (over 5-yrs), and a total cost at ~$30k vs. the Intel Skylake-based server at ~$250k. 
\n \n• DRIVE AP2X Release 9.0; Toyota End-to-End Collaboration for Autonomous Vehicles: NVIDIA \nemphasized its end-to-end strategy for autonomous vehicle development / enablement – from DGX \nSaturn V, Constellation (now available for datacenter deployment), and Xavier platforms, coupled with \nDRIVE AV, DRIVE IX, and KITT Reism. NVIDIA announced DRIVE AP2X Release 9.0 – a high function \nL2+ autopilot system. While NVIDIA’s demonstrations were impressive, Mr. Huang noted that we are still a \ncouple of years away from having production vehicles in the market. NVIDIA announced that Toyota is \npartnering with NVIDIA for end-to-end autonomous vehicle development. \n \n• Jetson Nano for Robotics: NVIDIA introduced a new robotics computer – Jetson Nano. Jetson Nano is \npriced at $99 and includes support for the entire CUDA-X AI model library. The company \nhighlighted its portfolio for robotics with KAYA (Jetson Nano), CARTER (Jetson Xavier), and LINK \n(Multiple Jetson Xaviers). \n \nWestern Digital \n(WDC – $49.15 – Outperform) \n \n** See NAND Flash commentary above ** \n \nMicron Technology \n(MU – $42.36 – Outperform) \n \nMicron Intros New NVMe PCIe Client SSD \nEarlier this week Micron announced the introduction of its new 2200 PCIe NVMe SSD, targeted at client \nPCs. The Micron 2200 PCIe NVMe SSD was noted as a vertically integrated solution, including 3D TLC \nNAND, internally designed ASIC drive controller, and firmware in an M.2 form factor. This SSD was noted \nas delivering performance of up to 3GB/second sequential reads, 1.6GB/second sequential writes, 240,000 \ninput/output operations per second (IOPS) random reads, and 210,000 IOPS random writes. Capacity points for the 2200 \nrange from 256GB through 1TB. 
\n \nWe continue to be focused on Micron’s ability to maintain market share in the SSD market as it transitions \nits SSD portfolio to NVMe-based solutions throughout 2019. As a reminder, Micron held an 8% revenue \nmarket share in the SSD market during 2018; 7% in client SSDs. \n \nSamsung Highlights 1Znm 8Gb DDR4 DRAM Introduction w/o Use of EUV; Micron Commencing \n1Znm DRAM Sample Shipments \nThis week Samsung announced that it has developed a 3rd generation 10nm (1Znm) 8Gb DDR4 DRAM – \nan industry first. This comes sixteen months after the company commenced mass production of its \n2nd generation 10nm (1Ynm) 8Gb DDR4 solutions. Samsung reports that it was able to \ndevelop 1Znm 8Gb DDR4 without the use of Extreme Ultra-Violet (EUV) processing. Samsung points out \nthat this represents the industry’s smallest memory process node and is positioned to cost effectively \nmeet demand for new DDR4 DRAM with +20% higher manufacturing productivity compared to the 1Ynm \nDRAM. Samsung will commence mass production of 1Znm 8Gb DDR4 solutions in 2H2019 to \naccommodate enterprise servers and high-end PC demand to be launched in 2020. \n \nSamsung’s announcement notes that 1Znm DRAM will pave the roadmap toward DDR5, LPDDR5, and \nGDDR6 solutions; subsequent versions of 1Znm DRAM solutions will drive increased capacities and \nperformance. Samsung states that it plans to increase the portion of its main memory production in its \nPyeongtaek fab to meet rising demand for next-generation DRAM solutions. As a reminder, Micron had \nreported this week that it was seeing good yields on its 1Ynm DRAM with an expected conversion to \nmaterialize over the coming quarters. The company also noted that it has made excellent progress in \n1Znm DRAM with initial sample product shipments commencing. 
\n \nIntel \n(INTC – $53.52 – Outperform) \n \n** Intel Will Be Hosting Data-Centric Product Launch Event on April 2nd in San Francisco ** \n \nHPCWire Interview with Jim Keller, Intel’s Head of Silicon Engineering Group \n \nArticle Link: https://www.hpcwire.com/2019/03/21/interview-with-2019-person-to-watch-jim-keller/ \n \nHPCWire published an interesting piece summarizing a recent interview of Intel’s Head of Silicon \nEngineering Group, Jim Keller. Mr. Keller has been at Intel for the past year and interestingly he \npreviously was a key developer for AMD’s Zen architecture, prior to joining Tesla. Some of the takeaways \nfrom the interview include: \n \n• Mr. Keller noted that Intel’s scale was one of the reasons he joined the company; the ability to leverage \nhis knowledge across a broadening range of applications. \n \n• Thoughts on a slowing Moore’s Law – Mr. Keller stated that he is not concerned about Moore’s Law \nslowing as Intel will elaborate its strategy to continue to drive performance enhancements over time. \n \n• Mr. Keller highlighted the evolution of Bell’s Law – transistor density yields computational intensity. He \nnotes that the industry is now moving from scalar computing to vector to matrix to spatial-based \ncomputing; each of these steps has been a quantum leap in the use of transistors for increasingly \ncomplicated computational models. \n \nAdvanced Micro Devices \n(AMD – $26.74 – Outperform) \n \nGoogle Stadia (Project Stream; Game Streaming); Positive Confirmation on Usage of AMD GPUs \nand CPUs \nEarlier this week, at the Game Developers Conference, Google introduced its new (albeit anticipated) \ngame streaming service called Stadia (codenamed: Project Stream). 
Shares of AMD went higher as this \nevent provided further confirmation / confidence in AMD’s positioning at Google for the new game \nstreaming service – i.e., Google utilizing AMD’s Radeon Pro GPUs and next-generation CPUs (7nm Rome \nEPYC CPUs). Google highlighted Stadia as a centralized community for gamers, creators, and developers \nwith initial demonstrations running at 60fps at 4K with HDR and surround sound; 120fps support at 8K is \nplanned. The service leverages Google’s datacenter footprint (7,500+ edge nodes) with content delivery \nnetworks supporting enough low latency data transfer to support the frame rates required for gaming. \nThe end user can play on PCs, smart TVs, tablets, or phones. Google will offer a custom Wi-Fi connected \ncontroller and integrates savings capabilities into YouTube. The controller also includes an integrated \nmicrophone for Google Assistant integration. Google noted that Stadia development has been going on \nfor years. Google plans to launch Stadia in 2019 in the U.S., Canada, and most of Europe. Pricing is \nexpected to be announced in the summer of 2019. \n \nFrom an AMD GPU / CPU perspective, AMD highlighted the GPU supporting 10.7 teraflops of performance \nwith 56 compute units and HBM2 (high-bandwidth memory). The CPU was noted as a custom x86 \nprocessor running at 2.7GHz with AVX2 support. \n \nBroadcom \n(AVGO – $292.98 – Market Perform) \n \nBroadcom Intros New Automotive Multilayer Ethernet Switches \nThis week Broadcom announced the introduction of its BCM8956X family of automotive multilayer \nEthernet switches, which are targeted at addressing the growing need for bandwidth, flexibility, security, \nand time-sensitive networking (TSN) for autonomous and connected vehicles. 
Broadcom highlighted that \nthe rapid adoption of in-vehicle electronics and increasing bandwidth for data intensive applications are \ndriving the need for Gigabit Ethernet. The BCM8956X includes highly-optimized switches in various port \nconfigurations with integrated 100BASE-T1 and 1000BASE-T1 PHYs to enable cost-effective designs for \nautomotive gateway, ADAS and infotainment applications. The integrated PCIe interface was noted as \nproviding high bandwidth connectivity to the host processor, while the on-chip Layer 3 flow accelerator \noffloads the host processor from compute intensive routing operations. Broadcom is currently sampling \nthe BCM8956X and shipping the BCM8988X to selected automotive OEMs and Tier 1 suppliers. \n \nSeagate Technology \n(STX – $47.57 – Market Perform) \n \nSeagate Aligned with HP Enterprise & NVIDIA for AI-Based Manufacturing \nThis week it was reported that Seagate is working with HP Enterprise and NVIDIA to develop an AI-dedicated \nplatform codenamed Project Athena that would be implemented in Seagate’s manufacturing. \nThe systems would enable a reported 20% reduction in clean room investments and reduce manufacturing times \nby as much as 10%. The company would deploy the systems in its manufacturing operations globally. \nSeagate has reported that the IP for the development of the machines is shared across the three \ncompanies. It reports that Seagate’s HDD manufacturing operations involve ~1,000 different process \nsteps; implementing sensors on the systems creates a significant amount of valuable data that \nwill be stored on the company’s edge cloud and utilized for enhanced processes. 
As an example, the \narticle notes that Project Athena can record several million microscopic images of magnetic heads \nproduced in the company’s factories each day. \n \nCray \n(CRAY – $24.95 – Outperform) \n \nDepartment of Energy Announces $500M Contract for Cray / Intel Aurora \nCray announced that the U.S. Department of Energy (DOE) would deploy the world’s first exascale \nsupercomputer, the Cray and Intel built Aurora, to be delivered by the end of 2021 (acceptance in 2022) \nto the Argonne National Laboratory (ANL). The announcement valued the Aurora contract at +$500M \nwith Cray’s portion valued at upwards of $100M (we would estimate $100M-$150M). Cray has highlighted \nthat this contract is one of the largest in Cray’s history; the second major Shasta win following the NERSC \nPerlmutter win valued at $146 million. Intel will serve as the prime contractor for the system (i.e., \nrecognizing revenue for CPUs, next-generation accelerators (GPUs), memory, and cabling). Cray will act as \na subcontractor providing advanced packaging, its Slingshot interconnect (note: initial Aurora system was \nslated to utilize Intel’s Omni-Path interconnect), software stack, and cabinet / cooling infrastructure. We \nexpect to see meaningful revenue contribution from the deal in the 2022 timeframe. As a reminder, \nAurora was originally planned as a 180 petaflop pre-exascale machine (initial cost of ~$200M) to begin \noperations in 2018, but was reimagined as an exascale machine following the delay of Intel’s Knights Hill \nprocessors. \n \nOur Thoughts – Expected, but Positive Validation of Cray’s Positioning for Next-Gen \nSupercomputer Deployments: While the announcement of Cray and Intel’s win was expected, we \ncontinue to be positive around the company’s ability to compete for additional exascale machines. 
As a reminder, the DOE announced in April 2018 that it planned to acquire up to three exascale systems at a \ncost of up to $1.8B with a budget range of $400M to $600M per system. The three systems include \nFrontier, an ORNL system expected to be delivered in 3Q21, El Capitan, expected in 3Q22 at LLNL, and a \npotential third unnamed exascale system at ANL dependent on funding. We think that the Aurora and \nPerlmutter wins make it more likely a Cray architecture is selected for a CORAL-2 exascale system. We \nwould also positively highlight the importance that the Trump Administration has placed on exascale \ncomputing, with the administration’s FY2020 DOE budget request asking $809M for exascale computing, up \n27% from the FY2019 request and +265% from the FY2016 enacted budget. In addition to the major \nsupercomputing funding and thus opportunities, we believe Cray remains well positioned for the \nconvergence between Artificial Intelligence workload requirements and supercomputing architectures. \n \nSource: U.S. Department of Energy \n \nNetApp \n(NTAP – $67.57 – Market Perform) \n \nNetApp Job Listings Update \nNetApp’s job listings totaled 425, down 1 from the prior week, vs. 345 entering 2019. The company \ncurrently has 77 sales openings, up 2 from the prior week and compared to 58 a year ago. The figure \nbelow highlights NetApp’s employee job listing trends over the past several years. 
\n \nPure Storage \n(PSTG – $20.74 – Outperform) \n \nPure Storage Introduces Hyperscale AIRI & FlashStack for AI \nPure Storage introduced new NVIDIA-based solutions that offer a complete portfolio of offerings for any AI \ninitiative – from early inception to large-scale production. The company announced a new hyperscale \nconfiguration of its AI-Ready Infrastructure (AIRI) as well as FlashStack for AI. AIRI was noted as built \njointly with NVIDIA and Mellanox to deliver multiple racks of NVIDIA DGX-1 and DGX-2 systems \nleveraging InfiniBand and Ethernet interconnect options. Pure noted that this solution can leverage \nNVIDIA NGC software container registry and AIRI scaling toolkit to enable data scientists to build \napplications with containerized AI frameworks and rededicate time to deriving valuable insights from data; \nincluding support for Kubernetes and Pure Service Orchestrator. FlashStack for AI was noted as jointly \nbuilt leveraging Cisco UCS C480ML, Pure Storage’s FlashBlade and NVIDIA GPU CUDA to target enterprise \nAI workloads. \n \nPure Storage Job Listings Update \nPure’s open job listings totaled 389 exiting the week, up 6 from the prior week, with sales listings up 4 \nfrom the prior week at 213 openings. Pure lists 73 Account Executive openings, down 10 from the \nprior week. We would note this includes 8 listings for dedicated FlashBlade Account Executives \n– down 1 from the prior week. Pure reported adding ~150 employees sequentially in F4Q19 bringing \ntotal headcount to over 2,800. \n \nNutanix \n(NTNX – $41.39 – Market Perform) \n \nInvestor Day Recap: C2021 Target Model Outlined; Positive Overview of Platform Vision \nWells’ Thoughts: Nutanix hosted what we would consider a net-positive Investor Day event. 
We remain \npositive on Nutanix’s technology differentiation / multi-cloud platform vision, which was thoroughly \noutlined throughout the event. However, after providing disappointing F3Q19 (Apr ’19) guidance amid a \nweakening forward pipeline build attributed to slowing marketing spend on lead generation (note: \ncompany did state that it has seen an improvement in its opportunity pipeline over the past few weeks), \nwe retain a cautious / prove-it stance on this momentum-driven valuation-expansion story – recovery / \ngrowth reacceleration is a F2020 story. We think the most notable takeaways from the Investor Day \ninclude: \n \n• $3B Billings Target Shifted Out by ~6-Mos (No Surprise); ~33% CAGR: Driven by consistent, and \nwe would assume relatively reasonable historically-based inputs vs. those outlined a year ago, Nutanix \nprovided a target build-up to a $3B billings level by calendar 2021 (vs. prior F2021; Jul ’21) – reflecting \nno big changes in avg. deal size expectations, repeat purchase multiples, and customer expansion \n(~23,600 by C2021 vs. 12,410 exiting F2Q19). \n \n• C2021 Target Model: Nutanix outlined a C2021 target model with +30% billings / revenue growth \n(subscription at ~75% of rev. vs. 47% in F2Q19), GM% at ~80%, non-GAAP EBIT% at 0%-5%, and \nFCF at ~10% of revenue. While we do not currently model C2021, we would note that we currently \nmodel F2021 (Jul ’21) billings at ~$2.2B, revenue at ~$1.9B, GM% at 80.4%, and EBIT% at -11.9%. 
\n \nWells Fargo Securities, LLC | 29 \nIT Hardware & Communications Networking \nEquity Research \n\uf0b7  Platform  TAM  /  Monetization  Opportunity  Expansion:  We  think  Nutanix  management  positively \noutlined  the  company’s  expanding  platform  addressable  market  opportunities  –  Essentials  and \nEnterprise  solutions  representing  and  incremental  $65B  TAM  opportunity.  The  company  noted  that  it \ncurrently sees ~90% of its revenue driven by its Core solutions. We continue to believe Nutanix’s ability \nto  monetize  up  the  stack  will  be  the  next  phase  of  the  company’s  hybrid  /  multi-cloud  platform  story \n(e.g.,  Xi  Cloud  Services  –  Beam,  Epoch,  Frame,  Leap,  etc.).  We  think  this  would  be  a  key  driver  of \nvaluation multiple expansion.  \n \n\uf0b7  Sales  Strategy:  Nutanix’s  incoming  Head  of  Americas  Sales,  Chris  Kaddaras,  laid  out  the  company’s \nstrategy  of  enhancing  the  company’s  sales  force  with  a  focus  on  segmentation  and  building  a \ndeep/qualified  sales  bench  over  the  next  year.  He  emphasized  his  focus  on  improving  the  company’s \npipeline  rigor  with  a  goal  of  a  3x  coverage  ratio  with  high  quality  (50%+  per  quarter  conversions). \nFinally, Mr. Kaddaras said Nutanix is pushing to industrialize a subscription transformation with a focus \non  customer  subscription  progress  and  strong  refresh  on  end-of-life  device  to  move  to  portable \nsubscription revenue. He also noted a large pipeline build in EMEA refresh opportunities over the last six \nmonths. \n \nPlease see our detailed report published on 3/20/19 for additional information.  \n \nNutanix Job Listings Update \nNutanix’s job listings totaled 578, up 20 from the prior week and vs. 503 a year ago.  
We would note that \nour conversations with Nutanix suggest that there have been some changes in the way that they list job \nopenings that has caused the declines in July 2018 rather than any meaningful change in hiring plans. The \ncompany  currently  has  319  sales  openings,  up  6  from  the  prior  week.  Nutanix  has  125  openings  for \nengineers, up 5 from the prior week (vs. 118 a year ago), while support listings were up 1 at 32 openings.  \nAs a reminder, Nutanix reported that it had exited F2Q19 with 4,700 total employees including 2,209 S&M \nemployees,  up  from  2,102  and  1,606  in  the  prior  and  year-ago  quarters.  The  figure  below  highlights \nNutanix’s employee job listing trends over the past several years. \n \n \n \nCommvault \n \n \n(CVLT – $64.80 – Outperform) \n \n \n \nCommvault Job Listings Update \nCommvault’s  job  listings  totaled  75,  down  8  from  the  prior  week  and  vs.  64  entering  2019.  Commvault \nhad  21  sales  openings,  2  openings  in  professional  services,  and  12  openings  in  systems  engineering. \nThis compares to 26, 2, and 12 openings last week, respectively. Commvault exited the December quarter \nwith  total  headcount  at  2,576  down  from  2,644  in  the  prior  quarter.  The  company  has  reduced  its \nheadcount by 9%  since  the  beginning  of  2018  as  part of  its  Commvault  Advance restructuring program. \nWe think the company’s sales listings could be an important metric to track as Commvault implements a \ngo-to-market  realignment  –  e.g.,  transitioning  30%-40%  of  its  direct  sales  facing  resources  (+300 \nemployees) to channel / partnership-facing roles. The figures below  highlight Commvault’s employee job \nlisting trends over the past several years. 
\n \n30 | Wells Fargo Securities, LLC \n \nWells Fargo Technology Weekly \n \nEquity Research \n \n \nHP Inc \n \n \n(HPQ – $19.63 – Market Perform) \n \n \n \nHP Introduces New Printer Solutions / Software & Security \nHP Inc. made several printing announcements at its Reinvent global partner event this week. Coming out \nof  the  January  quarter,  we  /  investors  will  continue  to  be  focused  on  HP’s  ability  to  stabilize  its  print \nsupplies business as it looks to right size its channel inventory; visibility into HP’s 4 box model remains  a \nkey focus.  \n \n\uf0b7  The company introduced new solutions in its multifunction printer lineup, including:  (1) LaserJet 600 \nSeries  (A4).  This  new  printer  includes  industry  first  innovations,  including  cartridge  access  control \n(ensures  all  toner  is  used  before  cartridge  is  replaced),  fixed  tray  guides  (helps  paper  jams),  and \npredictive  sensors  embedded  within  the  device  to  enable  SDS  (see  below).  (2)  LaserJet  700  &  800 \nSeries  (A3).  These  new  entry-level  models  were  noted  as  featuring  the  industry’s  strongest  print \nsecurity.  (3)  Energy  Efficient  LaserJet  400  /  500  Printers.  These  new  printers  were  noted  as \nutilizing  Original  HP  EcoSmart  black  toner  and  use  an  average  of  21%  less  energy  than  the  prior \ngeneration.  (4)  FutureSmart.  This  enhancement  across  enterprise  and  managed  printer  and \nmultifunction printers was noted as enabling new HP Custom Color Manager with RT preview to create \ncolor  adjustments.  It  also  includes  more  than  12  additional  new  features,  including  advanced  copier \nfeatures.  \n \n\uf0b7  Security:  HP  continues  to  highlight  the  importance  of  printer  security  –  noting  that  a  recent  survey \nfound  83%  of  respondents  had  secured  their  PCs,  and  55%  their  mobile  devices,  but  only  41%  had \nsecured  their  printers.  
The  company  notes  that  it  is  the  only  vendor  with  SD-PAC  Certified  firmware, \nwhich  includes  HP  FutureSmart  firmware,  HP  JetAdvantage  Security  Manager,  HP  Access  Control,  HP \nJetAdvantage SecurePrint and HP JetAdvantage Insights. \n \n\uf0b7  HP  introduced  a  suite  of  enhanced  solutions  including:  (1)  JetAdvantage  Apps  –  Connected  to  HP \nmulti-function  printers,  these  pre-built  applications  provide  management  and  development  tools \nsupported  by  HP  Cloud  services.  HP  noted  that  it  has  35  beta  apps  available  today  and  over  200 \ndevelopment partners. (2) Smart Device Services (SDS) – These services can predict failures of key \ncomponents. HP noted that partners who have adopted the SDS solution are reporting average service \ncost reduction of 17% or more.  \nWells Fargo Securities, LLC | 31 \nIT Hardware & Communications Networking \nEquity Research \n \n \n \n \n \nRequired Disclosures \nThis is a compendium report, to view current important disclosures and other certain content related to the \nsecurities recommended in this publication, please go to https://www.wellsfargoresearch.com/Disclosures or \nsend an email to: equityresearch1@wellsfargo.com or a written request to Wells Fargo Securities Research \n \nPublications, 7 St. Paul Street, Baltimore, MD 21202. \n \n \n \nAdditional Information Available Upon Request \nI certify that: \n1) All views expressed in this research report accurately reflect my personal views about any and all of the subject securities \nor issuers discussed; and  \n2) No part of my compensation was, is, or will be, directly or indirectly, related to the specific recommendations or views \nexpressed by me in this research report. \n \n  \n \n \n \nWells Fargo  Securities,  LLC  does  not  compensate  its research  analysts  based on specific  investment banking  transactions. 
\nWells  Fargo  Securities,  LLC’s  research  analysts  receive  compensation  that  is  based  upon  and  impacted  by  the  overall \nprofitability and revenue of the firm, which includes, but is not limited to investment banking revenue. \n \nSTOCK RATING \n1=Outperform:  The  stock  appears  attractively  valued,  and  we  believe  the  stock's  total  return  will  exceed  that  of  the \nmarket over the next 12 months. BUY \n2=Market Perform: The stock appears appropriately valued, and we believe the stock's total return will be in line with the \nmarket over the next 12 months. HOLD \n3=Underperform: The stock appears overvalued, and we believe the stock's total return will be below the market over the \nnext 12 months. SELL \nSECTOR RATING \nO=Overweight: Industry expected to outperform the relevant broad market benchmark over the next 12 months. \nM=Market  Weight:  Industry  expected  to  perform  in-line  with  the  relevant  broad  market  benchmark  over  the  next  12 \nmonths. \nU=Underweight: Industry expected to underperform the relevant broad market benchmark over the next 12 months. \nVOLATILITY RATING \nV=A stock is defined as volatile if the stock price has fluctuated by +/-20% or greater in at least 8 of the past 24 months or \nif  the  analyst  expects  significant  volatility.  All  IPO  stocks  are  automatically  rated  volatile  within  the  first  24  months  of \ntrading. \n \nAs of: March 22, 2019 \n \n48% of companies covered by Wells Fargo Securities, LLC Equity \nResearch are rated Outperform. \n50% of companies covered by Wells Fargo Securities, LLC Equity \nResearch are rated Market Perform. \n2%  of  companies  covered  by Wells  Fargo  Securities,  LLC Equity \nResearch are rated Underperform. \n \nWells  Fargo  Securities,  LLC  has  provided  investment  banking \nservices  for  38%  of  its  Equity  Research  Outperform-rated \ncompanies. 
\nWells  Fargo  Securities,  LLC  has  provided  investment  banking \nservices  for  26%  of  its  Equity  Research  Market  Perform-rated \ncompanies. \nWells  Fargo  Securities,  LLC  has  provided  investment  banking \nservices  for  11%  of  its  Equity  Research  Underperform-rated \ncompanies. \n \n32 | Wells Fargo Securities, LLC \n \nWells Fargo Technology Weekly \n \nEquity Research \nImportant Disclosure for U.S. Clients \nThis  report  was  prepared  by  Wells  Fargo  Securities  Global  Research  Department  (“WFS  Research”)  personnel  associated  with  \nWells  Fargo  Securities  and  Structured  Asset  Investors,  LLC  (“SAI”),  a  subsidiary  of  Wells  Fargo  &  Co.  and  an  investment  adviser \nregistered with the SEC.   If research payments are made separately from commission payments, this report is being provided by \nSAI.  For all other recipients in the U.S. this report is being provided by Wells Fargo Securities. \n \nImportant Disclosure for International Clients \nEEA – The securities and related financial instruments described herein may not be eligible for sale in all jurisdictions or to certain \ncategories of investors. For recipients in the EEA, this report is distributed by Wells Fargo Securities International Limited (“WFSIL”). \nWFSIL  is  a  U.K.  incorporated  investment  firm  authorized  and  regulated  by  the  Financial  Conduct  Authority.  For  the  purposes  of \nSection 21 of the UK Financial Services and Markets Act 2000 (“the Act”), the content of this report has been approved by WFSIL, \nan authorized person under the Act. WFSIL does not deal with retail clients as defined in the Directive 2014/65/EU (“MiFID2”). The \nFCA rules made under the Financial Services and Markets Act 2000 for the protection of retail clients will therefore not apply, nor will \nthe Financial Services Compensation Scheme be available. This report is not intended for, and should not be relied upon by, retail \nclients. 
\nAustralia – Wells Fargo Securities, LLC is exempt from the requirements to hold an Australian financial services license in respect of \nthe  financial  services  it  provides  to  wholesale  clients  in  Australia.  Wells  Fargo  Securities,  LLC  is  regulated  under  U.S.  laws  which \ndiffer from Australian laws. Any offer or documentation provided to Australian recipients by Wells Fargo Securities, LLC in the course \nof providing the financial services will be prepared in accordance with the laws of the United States and not Australian laws. \nCanada – This report is distributed in Canada by Wells Fargo Securities Canada, Ltd., a registered investment dealer in Canada and \nmember  of  the  Investment  Industry  Regulatory  Organization  of  Canada  (IIROC)  and  Canadian  Investor  Protection  Fund  (CIPF). \nWells  Fargo  Securities,  LLC’s research  analysts may  participate  in company events  such  as  site  visits  but  are  generally  prohibited \nfrom accepting payment or reimbursement by the subject companies for associated expenses unless pre-authorized by members of \nResearch Management. \nHong Kong  – This report is issued and distributed in Hong Kong by Wells Fargo Securities Asia Limited (“WFSAL”), a Hong Kong \nincorporated investment  firm licensed  and  regulated  by  the  Securities and  Futures  Commission  of  Hong Kong (“SFC”)  to  carry  on \ntypes 1, 4, 6 and 9 regulated activities (as defined in the Securities and Futures Ordinance (Cap. 571 of The Laws of Hong Kong), \n“the SFO”). This report is not intended for, and should not be relied on by, any person other than professional investors (as defined \nin  the  SFO).  Any  securities  and  related  financial  instruments  described  herein  are  not  intended  for  sale,  nor  will  be  sold,  to  any \nperson other than professional investors (as defined in the SFO).  The author or authors of this report may or may not be licensed \nby the SFC.  
Professional investors who receive this report should direct any queries regarding its contents to Mark Jones at WFSAL \n(email: wfsalresearch@wellsfargo.com ). \nJapan  –  This  report  is  distributed  in  Japan  by  Wells  Fargo  Securities  (Japan)  Co.,  Ltd,  registered  with  the  Kanto  Local  Finance \nBureau to conduct broking and dealing of type 1 and type 2 financial instruments and agency or intermediary service for entry into \ninvestment  advisory  or  discretionary  investment  contracts.    This  report  is  intended  for  distribution  only  to  professional  investors \n(Tokutei Toushika) and is not intended for, and should not be relied upon by, ordinary customers (Ippan Toushika). \nThe  ratings  stated  on  the  document  are  not  provided  by  rating  agencies  registered  with  the  Financial  Services  Agency  of  Japan \n(JFSA)  but  by  group  companies  of  JFSA-registered  rating  agencies.      These  group  companies  may  include  Moody’s  Investors \nServices Inc., Standard & Poor’s Rating Services and/or Fitch Ratings.  Any decisions to invest in securities or transactions should be \nmade after reviewing policies and methodologies used for assigning credit ratings and assumptions, significance and limitations of \nthe credit ratings stated on the respective rating agencies’ websites. \nWells Fargo Securities, LLC | 33 \nIT Hardware & Communications Networking \nEquity Research \nAbout Wells Fargo Securities \nWells Fargo Securities is the trade name for the capital markets and investment banking services of Wells Fargo & Company and its \nsubsidiaries,  including  but  not  limited  to  Wells  Fargo  Securities,  LLC,  a  U.S.  broker-dealer  registered  with  the  U.S.  
Securities  and \nExchange Commission  and  a  member  of  NYSE,  FINRA, NFA and  SIPC,  Wells  Fargo Prime  Services,  LLC, a  member  of  FINRA,  NFA \nand  SIPC,  Wells  Fargo  Securities  Canada,  Ltd.,  a  member  of  IIROC  and  CIPF,  Wells  Fargo  Bank,  N.A.  and  \nWells Fargo Securities International Limited, authorized and regulated by the Financial Conduct Authority. \nThis report is for your information only and is not an offer to sell, or a solicitation of an offer to buy, the securities or instruments \nnamed  or  described  in  this  report.  Interested  parties  are  advised  to  contact  the  entity  with  which  they  deal,  or  the  entity  that \nprovided this report to them, if they desire further information or they wish to effect transactions in the securities discussed in this \nreport. The information in this report has been obtained or derived from sources believed by Wells Fargo Securities Global Research \nDepartment (“WFS Research”), to be reliable, but WFS Research does not represent that this information is accurate or complete. \nAny opinions or estimates contained in this report represent the judgment of WFS Research, at this time, and are subject to change \nwithout notice. Certain text, images, graphics, screenshots and audio or video clips included in this report are protected by copyright \nlaw and owned by third parties (collectively, “Third Party Content”). Third Party Content is made available to clients by Wells Fargo \nunder license or otherwise in accordance with applicable law. Any use or publication of Third Party Content included in this report for \npurposes other than fair use requires permission from the copyright owner. Any external website links included in this publication \nare  not  maintained,  controlled  or  operated  by  Wells  Fargo  Securities.  
Wells  Fargo  Securities  does  not  provide  the  products  and \nservices on these websites and the views expressed on these websites do not necessarily represent those of Wells Fargo Securities. \nPlease  review  the  applicable  privacy  and  security  policies  and  terms  and  conditions  for  the  website  you  are  visiting.  All \nWells Fargo Securities  and  SAI  research  reports  published  by  WFS  Research  are  disseminated  and  available  to  all  clients \nsimultaneously through electronic publication to our internal client websites. Additional distribution may be done by sales personnel \nvia email, fax or regular mail. Clients may also receive our research via third party vendors. Not all research content is redistributed \nto our clients or available to third-party aggregators, nor is WFS Research responsible for the redistribution of our research by third \nparty aggregators. Equity Strategists focus on investment themes across the equity markets and sectors.  Any discussion within an \nEquity Strategy report of specific securities  is not intended to provide a fundamental analysis of any individual company described \ntherein.  The information provided in Equity Strategy reports is subject to change without notice, and investors should not expect \ncontinuing  information  or  additional  reports  relating  to  any  security  described  therein.  For  research  or  other  data  available  on  a \nparticular  security,  please  contact  your  sales  representative  or  go  to  http://www.wellsfargoresearch.com.  For  the  purposes  of  the \nU.K. Financial Conduct Authority's rules, this report constitutes impartial investment research. Each of Wells Fargo Securities, LLC \nand  Wells  Fargo  Securities  International  Limited  is  a  separate  legal  entity  and  distinct  from  affiliated  banks.  
Copyright  ©  2019  \nWells Fargo Securities, LLC \n \n \nSECURITIES: NOT FDIC-INSURED/NOT BANK-GUARANTEED/MAY LOSE VALUE \n \n \n \n34 | Wells Fargo Securities, LLC \n \n"
lst = [para.split("\n\uf0b7") for para in f.split("\n \n")]

In [56]:
lst[10]

['\uf0b7  DRAMeXchange expects overall DRAM prices to fall approximately 20% q/q in 1Q19 (no surprise). 2Q19 \nand 3Q19 prices are estimated to decline by 15%-20% and ~10% q/q, respectively. ',
 '  This  includes  server  DRAM prices  estimated  to  decline  by 20%  q/q  and  ~10%  q/q  in 2Q19  and 3Q19, \nrespectively; PC DRAM prices estimated to decline at a similar rate.  eMCP pricing or mobile devices are \nestimated  to  decline  by  10%-20%  q/q  in  2Q19  and  5%-10%  q/q  in  2Q190  and  3Q19,  respectively; \ndiscrete mobile DRAM prices are expected to decline by 5%-10% q/q in both 2Q19 and 3Q19. ',
 '  DRAMeXchange  estimates  that  inventory  levels  for  DRAM  suppliers  have  increased  to  over  6  weeks \n(note:  Micron  reported  134  days  of  total  inventory  on  a  dollar  basis  exiting  their  February  quarter).  \nServer & PC customers were noted to be sitting on over 7 weeks of DRAM inventory.   ',
 '  1Ynm  DRAM  is  expected  to  be  a  key  driver  of  cont

In [57]:
getPrediction(lst[10])

[('\uf0b7  DRAMeXchange expects overall DRAM prices to fall approximately 20% q/q in 1Q19 (no surprise). 2Q19 \nand 3Q19 prices are estimated to decline by 15%-20% and ~10% q/q, respectively. ',
  array([-2.1989045, -0.1175732], dtype=float32),
  'Positive'),
 ('  This  includes  server  DRAM prices  estimated  to  decline  by 20%  q/q  and  ~10%  q/q  in 2Q19  and 3Q19, \nrespectively; PC DRAM prices estimated to decline at a similar rate.  eMCP pricing or mobile devices are \nestimated  to  decline  by  10%-20%  q/q  in  2Q19  and  5%-10%  q/q  in  2Q190  and  3Q19,  respectively; \ndiscrete mobile DRAM prices are expected to decline by 5%-10% q/q in both 2Q19 and 3Q19. ',
  array([-0.28972465, -1.3801916 ], dtype=float32),
  'Negative'),
 ('  DRAMeXchange  estimates  that  inventory  levels  for  DRAM  suppliers  have  increased  to  over  6  weeks \n(note:  Micron  reported  134  days  of  total  inventory  on  a  dollar  basis  exiting  their  February  quarter).  \nServer & PC cu
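The two-element arrays returned by `getPrediction` look like log-probabilities: exponentiating each pair gives values that sum to roughly 1. Assuming that reading is right, and assuming index 0 means Negative and index 1 means Positive (inferred from the outputs above, not confirmed elsewhere in this notebook), a small helper makes the scores easier to read. The name `to_probabilities` is hypothetical.

```python
import math

# Assumption: label order inferred from the printed predictions above.
LABELS = ["Negative", "Positive"]

def to_probabilities(log_probs):
    """Exponentiate log-probabilities to get values in [0, 1]."""
    return [math.exp(float(x)) for x in log_probs]

def describe(log_probs):
    """Return the most likely label together with its probability."""
    probs = to_probabilities(log_probs)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Scores copied from the first two predictions above.
first = describe([-2.1989045, -0.1175732])
second = describe([-0.28972465, -1.3801916])
print(first, second)
```

Read this way, the first paragraph (about falling DRAM prices) is scored Positive with high confidence, which suggests a model fine-tuned on movie reviews may not transfer cleanly to equity research text.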

In [90]:
import os
from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdfpage import PDFPage
# From PDFInterpreter import both PDFResourceManager and PDFPageInterpreter
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfdevice import PDFDevice
# Import this to raise exception whenever text extraction from PDF is not allowed
from pdfminer.pdfpage import PDFTextExtractionNotAllowed
from pdfminer.layout import LAParams, LTTextBox, LTTextLine
from pdfminer.converter import PDFPageAggregator

In [91]:
def extract_pdf(file):
    """Return the text of every text box and text line in the PDF at `file`."""
    password = ""
    extracted = []
    # Keep the file open while pages are parsed; the context manager closes it afterwards.
    with open(file, 'rb') as fp:
        parser = PDFParser(fp)
        document = PDFDocument(parser, password)
        if not document.is_extractable:
            raise PDFTextExtractionNotAllowed
        rsrcmgr = PDFResourceManager()
        laparams = LAParams()

        device = PDFPageAggregator(rsrcmgr, laparams=laparams)
        interpreter = PDFPageInterpreter(rsrcmgr, device)

        for page in PDFPage.create_pages(document):
            interpreter.process_page(page)
            layout = device.get_result()
            for lt_obj in layout:
                if isinstance(lt_obj, (LTTextBox, LTTextLine)):
                    extracted.append(lt_obj.get_text())

    return "".join(extracted)

In [92]:
import re

def simple_process(sent):
    # Collapse whitespace runs to single spaces, then drop the PDF bullet glyphs.
    return re.sub("[\uf0b7\uf0d8]", "", re.sub(r"\s+", " ", sent))

In [93]:
def preprocess_pdf(file):
    text = extract_pdf(file)
    # Paragraphs are separated by a blank-looking line; skip fragments of 50 characters or fewer.
    paragraphs = [para for para in text.split("\n \n") if len(para) > 50]
    return [simple_process(para) for para in paragraphs]
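The split-and-clean steps can be tried without a PDF. The sketch below reimplements them on an in-memory string; `clean`, `split_paragraphs`, and the sample text are hypothetical stand-ins for illustration, not part of the notebook's pipeline.

```python
import re

def clean(text):
    """Collapse whitespace runs, then remove the two PDF bullet glyphs."""
    collapsed = re.sub(r"\s+", " ", text)
    return re.sub("[\uf0b7\uf0d8]", "", collapsed)

def split_paragraphs(raw, min_len=50):
    """Split on the '\\n \\n' separator and keep only fragments longer than min_len."""
    return [clean(p) for p in raw.split("\n \n") if len(p) > min_len]

# Made-up sample mimicking the extracted text: a page header, a bullet, a stray word.
sample = (
    "Page 1 \n \n"
    "\uf0b7  DRAM  prices are estimated to decline \nby 15%-20% q/q in the second quarter. \n \n"
    "short"
)
paragraphs = split_paragraphs(sample)
print(paragraphs)
```

The length filter drops the short header and the stray word, so only the bullet paragraph survives, with its line break and bullet glyph removed.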

In [97]:
pdf_list = preprocess_pdf('../Documents/ITHWEQ032219-125114.pdf')

In [98]:
getPrediction(pdf_list)

[(' Wells Fargo Technology Weekly Semi Thoughts, Weak S. Korea Memory Data, Networking Qtrly Review, & More ',
  array([-1.9451709 , -0.15427397], dtype=float32),
  'Positive'),
 ('  Strong Semiconductor Outperformance Resulting in Increased Investor Angst Ahead of 1Q19 Earnings Season? With the recent strong performance in semiconductors (SOX +1.3% and +22.1% last week and YTD vs. S&P +1% and +13%, respectively) we have begun to receive increasing investor questions on the set-up into 1Q19 earnings season. On a near-term overall basis, we think some profit taking in semis could materialize as a more cautious stance on 1Q19 earnings season materializes - most investors we have spoken with expect choppy / weak 1Q19 results and cautious 2Q19 outlooks; hope of a materializing 2H2019 recovery remains key focus. That said, we think it is becoming increasingly important to consider company- specific dynamics – we continue to highlight: (1) AMD: Shares rallied last week (+15%) on Google’s Sta

In [106]:
tokenizer.tokenize("AWS Offering New G4 Compute Instances using NVIDIA Turing-based T4 GPUs")

['aw',
 '##s',
 'offering',
 'new',
 'g',
 '##4',
 'compute',
 'instances',
 'using',
 'n',
 '##vid',
 '##ia',
 'turing',
 '-',
 'based',
 't',
 '##4',
 'gp',
 '##us']
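The output above shows domain terms such as "AWS", "NVIDIA", and "GPUs" being broken into sub-word pieces, which is what WordPiece does for words missing from its vocabulary. A minimal sketch of the greedy longest-match-first idea follows. The function `wordpiece` and the toy vocabulary are assumptions for illustration; the real tokenizer uses BERT's full vocabulary file and also handles lowercasing and punctuation splitting before this step.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedily split `word` into the longest vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Shrink the candidate from the right until it is found in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a continuation piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matched, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

# Toy vocabulary (hypothetical), chosen to mirror the pieces printed above.
toy_vocab = {"aw", "##s", "n", "##vid", "##ia", "gp", "##us", "g", "t", "##4"}

for word in ["aws", "nvidia", "gpus", "g4"]:
    print(word, wordpiece(word, toy_vocab))
```

Fragmentation like this hints that the pretrained vocabulary has little coverage of hardware and cloud terminology, which may limit how well the model represents these reports.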