# @author: Mudigonda Himansh

#### TODO
- [x] Tokenize the documents. 
- [x] Apply case folding.
- [x] Sort the distinct words in descending order of their frequencies. 
- [x] Remove top 30 stopwords. 
- [x] Apply Porter stemming algorithm. 
- [x] List all the terms to be included in the dictionary. 
- [x] Generate all (term, docID) pairs
- [x] Apply a sort based algorithm to build the (non-positional) inverted index (should include document frequency). 
- [x] Use Python dictionary data structure to store the inverted index.

## Information Retrieval on Cranfield Dataset

In [1]:
import re
import os
from nltk.stem import PorterStemmer
import nltk
import numpy as np
import json

## Directories

In [2]:
!trash ./cran/splitfiles/
!mkdir ./cran/splitfiles/
!trash ./cran/processed_doc/
!mkdir ./cran/processed_doc/

In [3]:
!tree

[01;34m.[0m
├── [00mPipeline.ipynb[0m
├── [00mREADME.md[0m
├── [01;34mcran[0m
│   ├── [00mcran.all.1400[0m
│   ├── [00mcran.qry[0m
│   ├── [00mcranqrel[0m
│   ├── [00mcranqrel.readme[0m
│   ├── [01;34mprocessed_doc[0m
│   └── [01;34msplitfiles[0m
└── [00mstopwords[0m

3 directories, 7 files


## Data Screening

In [4]:
with open('./cran/cran.all.1400') as main_file:
    for lineno, line in enumerate(main_file):
        print(line)

.I 1

.T

experimental investigation of the aerodynamics of a

wing in a slipstream .

.A

brenckman,m.

.B

j. ae. scs. 25, 1958, 324.

.W

experimental investigation of the aerodynamics of a

wing in a slipstream .

  an experimental study of a wing in a propeller slipstream was

made in order to determine the spanwise distribution of the lift

increase due to slipstream at different angles of attack of the wing

and at different free stream to slipstream velocity ratios .  the

results were intended in part as an evaluation basis for different

theoretical treatments of this problem .

  the comparative span loading curves, together with

supporting evidence, showed that a substantial part of the lift increment

produced by the slipstream was due to a /destalling/ or

boundary-layer-control effect .  the integrated remaining lift

increment, after subtracting this destalling lift, was found to agree

well with a potential flow theory .

  an empirical evaluation of the destalling ef


conjunction with a conventional hypersonic wind-tunnel air supply, a

means for investigating hypersonic heat transfer and surface

phenomena under conditions of flight reynolds numbers .

.I 38

.T

on the prediction of mixed subsonic/supersonic pressure

distributions .

.A

sinnott,c.s.

.B

j. ae. scs. 27, 1960, 767.

.W

on the prediction of mixed subsonic/supersonic pressure

distributions .

  high-speed wind-tunnel results are analyzed to derive a

semiempirical scheme for the prediction of transonic pressure

distributions .  the supersonic and subsonic parts of the flow are

treated separately, and then linked by an empirical shock

pressure rise relation .  the significance of the empirical results is

considered in relation to the physical mechanism of transonic

flows .  it is also shown that theoretical solutions can be

improved by introducing the empirical shock relation .

.I 39

.T

on the flow of a sonic stream past an airfoil surface .

.A

sinnott,c.s.

.B

j.ae.s

equation, relating turbulent skin friction

and boundary-layer thickness, was

utilized in a form which accounted for compressibility .

  consideration of the heat transfer to the wall permitted the wall

surface temperature, behind the wave,

to be determined .  the wall

thickness was assumed to be greater than the

wall thermal-boundary-layer

thickness .  it was found that the wall

temperature was uniform (as a

function of distance behind the wave)

for the laminar-boundary-layer case

but varied with distance for the turbulent-boundary-layer case .

.I 73

.T

investigation of the stability of the laminar boundary

layer in a compressible fluid .

.A

lees,l. and lin,c.c.

.B

naca tn.1115, 1946.

.W

investigation of the stability of the laminar boundary

layer in a compressible fluid .

  in the present report the stability of two-dimensional laminar

flows of a gas is investigated by the method of small perturbations .

the chief emphasis is placed on the case of the laminar


free convection, when a body force is acting parallel to the wall .  the

fluid is assumed to be semi-incompressible as usual .  in addition to

the obvious practical significance, this problem is also interesting in

the sense that it provides another exact solution of the

magnetohydrodynamic equations, since the only electromagnetic assumptions involved

are constant properties and freedom from excessive charges .

.I 88

.T

magnetohydrodynamic free-convection pipe flow .

.A

cramer,k.r.

.B

j. ae. scs. 28, 1961, 736.

.W

magnetohydrodynamic free-convection pipe flow .

it has been shown that transverse magnetic fields of practical

strengths exert considerable influence on liquid-metal,

free-convection, vertical, flat-plate and parallel-plate flow fields .

the extent of influence was determined by the magnitude of a

nondimensional parameter a which is the ratio of the hartmann

number to the fourth root of the grashof number, and is a

measure of the relative influence of t

well at high mach numbers .  however, it is quite inconsistent at lower

mach numbers, especially for bodies which deviate considerably from

circular cones .  the equivalent-cone method does not give satisfactory

results, mainly due to the fact that it considers only the local surface

element on the body independent of the other body elements in the

newtonian-theory manner .

.I 123

.T

the downstream influence of mass transfer at the nose

of a slender cone .

.A

cresci,r.j. and libby,p.a.

.B

j.aer.scs. 29, 1962, 815.

.W

the downstream influence of mass transfer at the nose

of a slender cone .

  the influence of localized mass transfer at the nose of a slender

cone under hypersonic flow conditions has been studied by

experimental and theoretical means .  two gaseous coolants, nitrogen

and helium, are injected through a porous plug subtending a

half angle of 30 .  the effect of the mass transfer on the shock

shape, pressure distribution, heat transfer, and transition a


the determination of turbulent skin friction by means

of pitot tubes .

  a simple method of determining

local turbulent skin friction on a smooth

surface has been developed which utilises a

round pitot tube resting on the surface .

assuming the existence of a region near the

surface in which conditions are functions

only of the skin friction, the relevant physical

constants of the fluid and a suitable length,

a universal non-dimensional relation is

obtained for the difference between the total

pressure recorded by the tube and the static

pressure at the wall, in terms of the skin

friction .  this relation, on this assumption,

is independent of the pressure gradient .

the truth and form of the relation were first

established, to a considerable degree of

accuracy, in a pipe using four geometrically

similar round pitot tubes--the diameter

being taken as representative length .  these four

pitot tubes were then used to determine

the local skin friction coefficient at

  it is found that the eucken-value

of conductivity could be exceeded

if the relaxation times are non-zero and

the plates very effective in

exciting the inert mode .  when relaxation

times are very short the effect

of the walls on the energy transfer rate

is small, but the walls make

their presence felt by distorting the

temperature profiles in /boundary

layers/ adjacent to the walls which are

of order in thickness

time) .  this result is

analogous to hirschfelder's (1956) for the

case of chemical reactions .

  for experimental measurement of

conductivity in a hot wire cell type

of apparatus it is shown that extrapolation

of measured reciprocal

conductivities to zero reciprocal pressure

should load to the full eucken

value .  it is also shown that the slope of

reciprocal apparent (measured)

conductivity versus reciprocal pressure

curves is a function of relaxation

time as well as of the accommodation

coefficients .  it is quite possible

that the relaxation ef


density wind tunnel .  agreement between theory and experiment is

quite satisfactory .

.I 184

.T

scale models for thermo-aeroelastic research .

.A

molyneux,w.g.

.B

rae tn.struct.294, 1961.

.W

scale models for thermo-aeroelastic research .

  an investigation is made of the

parameters to be satisfied for

thermo-aeroelastic similarity .  it is concluded

that complete similarity obtains

only when aircraft and model are identical

in all respects, including size .

  by limiting consideration to

conduction effects, by assuming the major

load carrying parts of the structure

are in regions where the flow is either

entirely laminar, or entirely turbulent,

and by assuming a specific

relationship between reynolds number and nusselt

number, an approach to similarity can

be achieved for small scale models .

experimental and analytical work is

required to check on the validity of these assumptions .

  it appears that existing hot wind

tunnels will not be completely

adeq

stratford,b.s. and sansome,g.e.

.B

arc r + m 3275, 1960.

.W

theory and tunnel tests of rotor blade for supersonic

turbines .

  in special circumstances where a large work

output is required from a turbine in a single stage

it is necessary to use high pressure ratios across the

nozzle blades, thus producing supersonic velocities at

inlet to the rotor .  as part of an investigation into such

turbines, several designs for the inter-blade passages of

the rotor have been tested in a two-dimensional tunnel,

a design theory being developed concurrently .

  the first design, featuring constant passage width

and curvature as in steam-turbine practice, but having

thin leading and trailing edges, was found to suffer from

focusing of the compression waves from the concave

surface, with consequent flow separation from the

opposite convex surface .  it gave a velocity coefficient of

measured at an inlet mach number of 1.90 and turning

angle of 140 deg .  the measured value compa


navier-stokes solutions at large distances from a finite body .

.A

i-dee chang

.B

.W

navier-stokes solutions at large distances from a finite body .

this paper is concerned with a theoretical investigation of the flow

field at large distances from an object moving through a viscous fluid .

the discussion will be restricted to the case of two-dimensional

stationary incompressible flow .  the object will be assumed to be of

finite size .  the domain of the fluid is infinite and it is assumed

that there are no other boundaries for the fluid except that of the

given object .  the reynolds number will be assumed to have a fixed

value., thus we shall not consider the limiting cases of the reynolds

number tending to zero or to infinity .

.I 229

.T

interference between the wings and tail surfaces of

a combination of slender body, cruciform wings and

cruciform tail set at both incidence and yaw .

.A

owen,p.r. and anderson,r.g.

.B

rae r.aero.2471, 1952.

.W

interference 

  an ideal problem is here discussed .  a finite

amount of energy is suddenly released

in an infinitely concentrated form .  the motion

and pressure of the surrounding air is

calculated .  it is found that a spherical shock

wave is propagated outwards whose

radius r is related to the time t since the explosion

started by the equation

where is the atmospheric density, e is

the energy released and s a calculated

function of, the ratio of the specific heats of air .

  the effect of the explosion is to force most

of the air within the shock front into a

thin shell just inside that front .  as the front

expands, the maximum pressure

decreases till, at about 10 atm., the analysis ceases

to be accurate .  at 20 atm. 45 of

the energy has been degraded into heat which is

not available for doing work and used

up in expanding against atmospheric pressure .

this leads to the prediction that an

atomic bomb would be only half as efficient, as

a blast-producer, as a high explosi


was used for the static-pressure surveys and for some of the

schlieren photographs .  in order to determine the flow conditions several

blade chords downstream of the cascade, schlieren photographs were

taken of the flow through a cascade of 18 blades having an axial

width of 0.60 inch .

  for the blade design studied, even at static-to-total pressure

ratios considerably lower than that required to give critical

velocity at the throat section, the flow was deflected in the tangential

direction as predicted for the incompressible case .  as the pressure

ratio was lowered further, the aerodynamic loading of the rear

portion of the blade reached a maximum value and remained constant .

after this condition was attained, the expansion downstream of the

cascade took place with a constant tangential velocity so that no

further increase in the amount of turning across the blade row and

no further increase in the loading of the blade was available .

.I 278

.T

on source and vor

presented with consideration of various model configurations .

  the method has been applied to various types of configurations in

several wind-tunnel investigations conducted by the national advisory

committee for aeronautics at mach numbers up to 4, and in all cases

the calculated roughness height caused premature boundary-layer

transition for the range of test conditions .

.I 315

.T

scale effects at high subsonic and transonic speeds

and methods for fixing transition in model experiments .

.A

haines,a.b., holder,d.w. and pearcey,h.h.

.B

arc r + m3012, 1954.

.W

scale effects at high subsonic and transonic speeds

and methods for fixing transition in model experiments .

  the major scale effects at high subsonic and transonic

speeds arise from differences between the conditions

under which laminar and turbulent boundary layers separate, and in

how they behave after separation .  for turbulent

boundary layers, these conditions and behaviour do not vary greatly

as t


with constant specific heats and to bodies with pointed noses are

removed .  only steady plane or axisymmetric flows are

considered .

  inspection of the governing system of equations shows that a

similitude law exists for flow fields, under local thermal

equilibrium, having the same free-stream atmosphere .  for flows of

ideal gas with constant specific heats, the requirement of the same

free-stream atmosphere--i.e., the same composition, pressure, and

density--can be replaced by the requirement of the same ratio of

specific heats .

  for flows over blunted wedges or cones, special laws of

similitude can be obtained .

  application of the similarity rules is examined for the case of

hypersonic flows of an ideal gas with over flat plates

with blunt leading edges, and for the case of equilibrium air

flows over wedges .  the possibility of simulating nonequilibrium

flows over slender or thin bodies is also pointed out .

.I 333

.T

boundary-layer interaction on a yawed 

results with those obtained by giese and bergdolt for 15

half-angle cones at m = 2.45 .  following the observation by charters

and stein that drag coefficient measurements on blunted cones

imply a reynolds number effect, giese and bergdolt study the

convergence to conical flow of the perturbed flow about a cone

with truncated tip .  they employ the mach-zehnder

interferometer and the conical flow criterion as analytical tools .

.I 372

.T

an experimental investigation of flow about simple

blunt bodies at a nominal mach number of 5. 8.

.A

oliver,r.e.

.B

j. ae. scs.23, 1956, 177.

.W

an experimental investigation of flow about simple

blunt bodies at a nominal mach number of 5. 8.

  an experimental investigation was

conducted in the galcit hypersonic

wind tunnel to determine flow characteristics

for a series of blunt bodies at a

nominal mach number of 5.8 and free-stream

reynolds numbers per in. of

measured values for the pressure

coefficient distributions are compa


j.ae.scs. 24, 1957.

.W

the shear flow along a flat plate with uniform suction .

  recently, several authors have investigated the boundary

layer in a shear flow .  in this note, an exact solution of the

navier-stokes equations will be presented, which represents the

boundary layer along an infinite flat plate with uniform suction

situated in a shear flow .

.I 394

.T

the viscous flow near a stagnation point when the external flow has

uniform vorticity .

.A

j. t. stuart

.B

national physical laboratory, teddington, middlesex, england

.W

the viscous flow near a stagnation point when the external flow has

uniform vorticity .

in view of the recent controversy between li and glauert on the nature

of the solution of the boundary-layer equations when the external

flow is rotational, it seems worthwhile to draw attention to a certain

exact solution of the navier-stokes equations which lends support to

glauert's point of view .

.I 395

.T

new methods in heat flow analysi

in the transformed plane .  the kutta-joukowsky

condition of finite velocity at the trailing edge also leads to a

condition on in this plane .  from these conditions and the

general expression for the circulation and the strengths

of the doublets and quadruplets required for the force and

moment are determined .  hence, the formulae for lift and

moment coefficient are obtained .  these involve, in addition

to the usual (potential-flow) terms, terms proportional to .

  the ten functions that appear in the expressions for the

lift and moment coefficients are tabulated for values of the

thickness ratio between 0 and 1 .  the aerodynamic-center

position and the coefficient of the moment about the

aerodynamic center are also calculated and are presented

graphically as functions of .

.I 453

.T

the influence of two-dimensional stream shear on airfoil maximum lift .

.A

.B

.W

the influence of two-dimensional stream shear on airfoil maximum lift .

the cornell aeronautical la


body and airfoil shapes and its predictions are

compared with experiments and results

of other theoretical investigations .

.I 470

.T

some notes for the small disturbance linear theory

of the method of local linearisation of the flow over

an airfoil at mach number of unity .

.A

miyai,y.

.B

proc. 10th japan nat. cong. app. mech. iii-4, 1960, 207.

.W

some notes for the small disturbance linear theory

of the method of local linearisation of the flow over

an airfoil at mach number of unity .

  in this paper, the pressure

distribution at the surface of a symmetrical

non-lifting aerofoil with free stream

mach number of unity has been

investigated by means of the small-disturbance

linear theory or the method of local

linearization .  and by comparing with

the calculated results based on an

hodograph method, the accuracy of these

approximate methods has been

evaluated .  moreover, when these approximate

methods are used for the calculation

of the pressure coefficie

inman,r.m.

.B

j. ae. scs. 29, 1962, 1014.

.W

consideration of energy separation for laminar slip

flow in a circular tube .

  the energy separation for laminar low-density-nonunity prandtl

number flow in circular cross-section tubes is the topic of this

note .  a conclusion is reached as to the effect of prandtl number

on the velocity profiles for these flows .  however, in order to

reach valid quantitative conclusions the reviewer feels that more

detailed analysis is in order, and that the analysis as presented

here is of qualitative value only .

.I 535

.T

shroud design for simulating hypersonic flow over the nose of a

hemisphere .

.A

roger dunlap

.B

associate research engineer, dept, of aeronautical and

astronautical engineering, ann arbor, mich.

.W

shroud design for simulating hypersonic flow over the nose of a

hemisphere .

following is an analytical method for designing a shroud which will

generate the hypersonic pressure distribution on a hemisphere .  the


chemical kinetics of high temperature air .

  when a hypersonic object enters earth's atmosphere, a shock

wave is formed in front of it, and the air passing through this

shock wave is heated to high temperatures .  the shock heated

molecules equilibrate their translational and rotational

degrees of freedom within a distance of a few mean free paths .

to achieve equilibrium, it is necessary to excite vibration,

dissociate molecules, produce new molecules and produce ions

and electrons .  the problem is complex, since all these

phenomena occur simultaneously and because the reaction rates depend

on the temperature, density and composition which are changing

during the relaxation toward equilibrium .

  the experimental techniques used to investigate these

reactions are briefly discussed along with the resulting rate

expressions obtained by the various investigators .  a compilation

of the rate expressions for these reactions representing the

author's evaluation of all the

mangler,k.w. and smith,j.

.B

rae r. aero.2593, 1957.

.W

calculation of the flow past slender delta wings with

leading edge separation .

  the flow past a slender delta

wing with a sharp leading edge, at

incidence, usually separates along

this edge, i.e. a vortex layer extends

from the edge and rolls up to form

a /core/ (a region of high vorticity) .

a potential flow model of this is

constructed in which the layer is

replaced by a vortex sheet which

is rolled up into a spiral in the region

of the /core/ .  this problem

is reduced to a two-dimensional one by

assuming a conical field and using

slender wing theory .  the shape and strength

of the sheet are determined by

the two conditions that it is a stream

surface and sustains no pressure

difference .  use is made of results

previously obtained for the core

region and the remaining finite part of

the sheet is dealt with by choosing

certain functions for its shape and

strength .  the parameters in these

functi


cruise performance of channel-flow ground effect machines .

  the performance theory for high-speed air-cushion vehicles

operating in close proximity to the ground is developed .  the

analysis is restricted to cruise flight of vehicles of rectangular

planform employing an air pressure seal between the ground and

the vehicle along the two streamwise sides .  the variation of

the optimum rearward deflection angle of the side jet pressure

seal with speed for minimum overall power expenditure and

maximum range is found .  it is concluded that a mixed

propulsion system (jet deflection plus propeller(s)) is required .

volume flow and the corresponding fan pressure rise needed are

also calculated .  the maximum lift drag ratio is determined .

  the maximum thickness ratios of the vehicles are considered

to be large compared with the ground-height vehicle-length

ratio .  two-dimensional airfoil theory is employed to show that

close to stagnation conditions exist below the vehic

finite cylinders can be obtained .  the two

shear distributions in the free stream can

be approximated to the linear shear

distribution and the shear present in an

unretarded incompressible boundary layer

respectively .  in every case the stagnation

streamline is displaced from the position

opposite the line of symmetry of the

cylinder, and general expressions are

obtained for this displacement .  the line of

symmetry may be in the direction of

or perpendicular to the direction of flow .

the two particular examples cited are

those of a general elliptic cylinder and

cylinders of the form where and being the polar

coordinates, and 2p the maximum width

of the cylinder .

.I 660

.T

the fundamental solution for small steady three dimensional

disturbances to a two dimensional parallel shear flow .

.A

lighthill,m.j.

.B

j. fluid mech. 3, 1957, 113.

.W

the fundamental solution for small steady three dimensional

disturbances to a two dimensional parallel shear flow .

 

hanson,p.w.

.B

nasa tn.d984, 1961.

.W

aerodynamic effects of some configuration variables

on the aeroelastic characteristics of lifting surfaces

at mach numbers from 0. 7 to 6. 86 .

  results of flutter tests on

some simple all-movable-control-type

models are given .  one set of models,

which had a square planform with

double-wedge airfoils with four

different values of leading- and

trailing-edge radii from 0 to 6 percent chord

and airfoil thicknesses of 9, 11,

at mach numbers from 0.7 to 6.86 .

the bending-to-torsion frequency

ratio was about 0.33 .  the other set of

models, which had a tapered planform

with single-wedge and double-wedge

airfoils with thicknesses of 3, 6, 9,

and 12 percent chord, was tested

at mach numbers from 0.7 to 3.98 and

a frequency ratio of about 0.42 .

  the tests indicate that, in general,

increasing thickness has a

destabilizing effect at the higher mach

numbers but is stabilizing at

subsonic and transonic mach numbers .

double-w

lifting reentry configuration having folding wingtip panels . the

configuration is of the type used in a high angle-of-attack /near 90degree/

 reentry to minimize aerodynamic heating . by unfolding the wingtip

panels into the airstream, a moderate angle-of-attack glide is used for

a controlled landing . the basic configuration tested utilized a

whose area was 25 percent of the total wing area . the effects of

varying the plan form and size of the wingtip panels was studied as well

 as the effects of unfolding the wingtip panels in a high angle-

of-attack attitude . tests were made at mach numbers of 0.40, 0.60, and

.I 712

.T

low-speed longitudinal aerodynamic characteristics associated with a

series of low-aspect ratio wings having variations in leading-edge

contour .

.A

spencer, b. and hammond, a.d.

.B

nasa tn d-1374, 1962 .

.W

low-speed longitudinal aerodynamic characteristics associated with a

series of low-aspect ratio wings having variations in leading-edge

co


structures department of the dvl concerning the elastic stability of

isotropic and orthotropic cylindrical shells loaded in axial compression

and internal pressure .  these studies are based on the nonlinear theory

of finite deformations .  the theoretical rsults will be compared with

new experimental results obtained with a series of axially loaded

pressurized isotropic and orthotropic cylindrical shells .

.I 744

.T

lower buckling load in the non-linear buckling theory

of thin shells .

.A

tsien,h.s.

.B

q. app. math. 5, 1947, 236.

.W

lower buckling load in the non-linear buckling theory

of thin shells .

  for thin shells the relation between

the load p and the deflection beyond the

classical buckling load is very often

non-linear .  for instance, when a uniform thin

circular cylinder is loaded in the axial

direction, the load p when plotted against the

end-shortening has the characteristic

shown in fig. 1 .  if the strain energy s and the

total potential are c


.I 777

.T

a technique for rendering approximate solutions to

physical problems uniformly valid .

.A

lighthill,m.j.

.B

phil. mag. 40, 1949, 1179.

.W

a technique for rendering approximate solutions to

physical problems uniformly valid .

  a method is described for treating

some of the characteristically

non-linear problems of physics, in

particular those involving a non-linear

partial differential equation for

which an approximate linearization is

permissible everywhere except in a

limited region, such as the

neighbourhood of (5) a singular characteristic

of the approximate solution, or of

approximation is valueless .  the

method involves a transformation of

an independent variable, which is

determined progressively with successive

approximations to the solution ..

only one step being necessary if a

first approximation valid uniformly

be obtained .  the method is most

easily understood in its application

to simple first order ordinary

differential equation

most severe during the climb and descent

phases of the flight plan .

during the cruise phase of the flight,

the boom pressures are of much

lesser intensity but are spread laterally

for many miles .  the manner

in which the airplane is operated appears

to be significant,. for example,

the boom pressures during the climb,

cruise, and descent phases can be

minimized by operating the airplane at

its maximum altitude consistent

with its performance capabilities .

.I 811

.T

an investigation of lifting effects on the intensity

of sonic booms .

.A

morris,j.

.B

j. roy. aero. soc. 64, 1960, 610.

.W

an investigation of lifting effects on the intensity

of sonic booms .

  this paper is a brief summary of

an investigation made to check the effect of lift

on the shock noise of aircraft flying at supersonic

speeds .  the method of hayes has been

combined with the theory of whitham to

predict the asymptotic shock strength of wings

carrying lift and of combinations of bodie

pressure, while the other depends upon the dimensions of the shell .

the equations are solved for several ranges of the parameters

under boundary conditions corresponding to a fixed edge .

  the solution, carried out numerically on the aec univac

at new york university, yields a complete description of the

stresses and deflections as functions of the polar angle over a

wide range of values of the loading parameter and the

dimensional parameter .  prediction of the upper buckling load is then

made by means of a numerical criterion based on the load vs.

deflection curve .  for some cases, the postbuckling behavior

is investigated .  the results agree well with existing

experimental and theoretical studies and cover a wide range of cases

not previously treated .

.I 831

.T

buckling of shallow shells under external pressure .

.A

reiss,e.l.

.B

j. app. mech. 25, 1958,556.

.W

buckling of shallow shells under external pressure .

a formula for the initial buckling loads for


  since the conventional elastic analysis of thermal stress problems

coupled with limiting creep rates and time-dependent fracture stresses

as (inelastic) design criteria, results in design procedures for thermal

stresses (in heat exchangers, nuclear reactors, flight structures at

supersonic speeds, etc.) of considerable unreality, the effect of

various types of rheological behavior (viscoelastic, plastic, work

hardening) on the level of thermal stresses is analyzed under simplified

assumptions, such as uniaxial stress and polar or cylindrical symmetry .

the effect on the thermal stress intensity of the rheological behavior

of the material is shown to be very significant, particularly with

respect to stress relaxation and the development of residual stresses .

.I 871

.T

steady-state creep through dislocation climb .

.A

weertman,j.

.B

j.app. phys. 28, 1957, 362.

.W

steady-state creep through dislocation climb .

  a dislocation climb creep model is considered which d

pitot-static head .  the tests were made

during september and october, 1951 .

.I 905

.T

comparative tests of pitot-static tubes .

.A

merriam,k.g. and spaulding,e.r.

.B

naca tn.546.

.W

comparative tests of pitot-static tubes .

  comparative tests were made on seven conventional

pitot-static tubes to determine their static, dynamic, and

resultant errors .  the effect of varying the dynamic

opening, static openings, wall thickness, and inner-tube

diameter was investigated .  pressure-distribution measurements

showing stem and tip effects were also made .  a tentative

design for a standard pitot-static tube for use in

measuring air velocity is submitted .

  this report covers an investigation conducted under

the auspices of the national research council .

.I 906

.T

review of the pitot tube .

.A

folson,r.g.

.B

asme trans. 70, 1956.

.W

review of the pitot tube .

  this paper is an attempt to bring together the

important information regarding pitot tubes and the


fundamentally the same as those which determine the critical

state for transition at an artificial disturbance of a three-

dimensional character .

.I 934

.T

stability of cylindrical and conical shells of circular

cross section, with simultaneous action of axial compression

and external normal pressure .

.A

mushtari,k.m. and sachenkov,a.v.

.B

naca tm.1433, 1958.

.W

stability of cylindrical and conical shells of circular

cross section, with simultaneous action of axial compression

and external normal pressure .

  we consider in this report the

determination of the upper limit of

critical loads in the case of simultaneous

action of a compressive force,

uniformly distributed over plane cross

sections, and of isotropic external

normal pressure on cylindrical or

conical shells of circular cross

section .  as a starting point we use

the differential equations for neutral

equilibrium of conical shells (ref. 1)

which have been used for the

solution of the problem of

where k is the thermal conductivity of the fluid, o its prandtl number,

p its density, u its viscosity, r(x) is the skin friction,

and t(x) the excess of wall temperature over main stream temperature .

a critical appraisement of the formula indicates that it should be very

accurate for large, but that for of order 0.7 (for most gases) the

constant should be replaced by 0.73, when the error should not exceed

this yields a formula for nusselt number in terms of the reynolds number

r and the mean square root of the skin friction coefficient c, in the

case of uniform wall temperature .

however, for the boundary layer with uniform main stream, the

original formula is accurate to within 3 percent even for .  by

known transformations an expression is deducted for heat

transfer to a surface, with arbitrary temperature distribution

along it, and with a uniform stream outside it at arbitrary

mach number (equation (42)) .  from this the temperature distribution

along such a surface


supersonic tunnel at a constant mach number of 1.5 .

the high-velocity jet was observed to alter the pressure distribution

over the body in such a way that the pressure drag of the body was

reduced,. thus, in a restricted sense, the nose jet produced a thrust on

 the body . under the conditions investigated, the thrust produced by

the nose jet was never so large as that which would be expected from a

conventional rearward jet . for example, under the best conditions

tested /mach number of 1.07/ the reduction in body pressure drag caused

by the nose jet more than compensated for the negative thrust of the jet

 itself . however, the magnitude of the net reduction in drag /change in

 body pressure drag with jet on and jet off minus the adverse thrust of

the jet/ was only about one-half of the thrust which would be produced

by the same jet exhausting rearward . the appearance of such an

unexpectedly large effect in the first trial indicated the phenomenon to be

worth further

the method may prove to be useful because

of its simplicity .

  a presentation of the results of an

experimental investigation of the

effects of column imperfection and

column-material variation is made .  it

is found that column-capacity variations of

the order of 10 per cent can

result from column-imperfection differences

and column-material variation .

  the results of an experimental study of

the variation of column

capacity with temperature of exposure are presented .

they indicate that column

efficiency, as measured by decrease in capacity,

can be acceptable for very

short times at the higher temperatures .  the

efficiency at these higher

temperatures falls rapidly, however, with increasing time .

.I 1020

.T

note on creep buckling of columns .

.A

.B

.W

note on creep buckling of columns .

  the results of short-time

creep-buckling and creep-bending tests of

slenderness ratio 111 are presented .

the tests were performed at 600 f,

and strain measurement


and outer edges,. these edges are otherwise free .  several

approximations have been proposed to describe the behavior of

these springs .  a first approximation (1) is based on the

assumption that meridional strains are negligible .  this requires that the

shell remain conical after deformation and also that the

extensional strain of meridional lines on the middle surface vanish .

another approximation (2) retains only the assumption that the

shell remains conical .  the first assumption satisfies neither of

the two boundary conditions at the free edges,. the latter

violates the condition of vanishing moment at the free edges .

recently the authors presented a series solution (3) for a special

case, namely, the case of an annular plate under similar loading .

numerical solutions for the shallow conical shell under these

conditions of load have also been obtained (4) .  an examination

of these results indicates that the meridional bending stresses are

of much smaller mag

primarily for use in the numerical solution

of partial differential equations .  consideration

is given to the form, as well as to

the magnitude, of the leading terms in the

error, and what is believed to be for

most purposes optimum combinations are

thus selected for the simpler compact

sets of nodes .

.I 1080

.T

viscous flow round a sphere at low reynolds numbers . /l40/ .

.A

jenson, v.g.

.B

proc. roy. soc. series, a. v. 249, p. 346, 1959 .

.W

viscous flow round a sphere at low reynolds numbers . /l40/ .

relaxation methods are outlined, and the present problem formulated in

modified spherical polar co-ordinates . the results of calculations made

  for r 5, 10, 20, 40 are presented in the form of stream function and

vorticity distributions,. and further results of pressure distributions,

 velocity distributions, and drag coefficients, calculated from them .

these results are shown to compare favourably with experimental work,

showing a steady trend from symmetri


damping measured at high angles of attack was generally larger

than that at low angles of attack .

.I 1116

.T

general instability of stiffened cylinders .

.A

becker,h.

.B

naca tn.4237, 1958.

.W

general instability of stiffened cylinders .

  theoretical buckling stresses

are determined in explicit form for

circular cylinders with

circumferential and axial stiffening .  the

loadings are axial compression, radial

pressure, hydrostatic pressure,

and torsion .  analyses were confined

to moderate-length and long

cylinders .  the investigation was based

upon the use of a form of donnell's

equation derived by taylor which is

applicable to orthotropic cylinders .

the derivation of this equation is

presented in this report .

.I 1117

.T

stability of orthotropic cylindrical shells under combined

loading .

.A

hess,t.e.

.B

ars j. 31, 1961.

.W

stability of orthotropic cylindrical shells under combined

loading .

  the increasing use of fiber and whisker

reinforced


.I 1135

.T

limit analysis of symmetrically loaded thin shells

of revolution .

.A

drucker,d.c. and shield,r.t.

.B

j. app. mech. 26, 1959, 61.

.W

limit analysis of symmetrically loaded thin shells

of revolution .

the yield surface for a thin cylindrical shell

is shown to be a very good approximation to

the yield surface for any symmetrically

loaded thin shell of revolution .  hexagonal

prism approximations to this yield surface,

appropriate for pressure vessel analysis, are

described and discussed in terms of limit

analysis .  procedures suitable for finding

upper and lower bounds on the limit pressure

for the complete vessel are developed and

evaluated .  they are applied for illustration

to a portion of a toroidal zone or knuckle held

rigidly at the two bounding planes .  the

combined end force and moment which can be

carried by an unflanged cylinder also is discussed .

.I 1136

.T

design of thin walled torispherical and toriconical

pressure - vessel heads 


.T

the buckling of cylindrical shells under longitudinally varying loads .

.A

v. i. weingarten

.B

space technology laboratories, inc., los angeles, calif.

now with aerospace corp., los angeles, calif.

.W

the buckling of cylindrical shells under longitudinally varying loads .

two problems illustrating the effect of nonuniformity of loading

on the buckling characteristics of circular cylinders are

investigated .  the first problem deals with the effect of

linearly varying axial compressive stress, such as would be

produced by the weight of the propellant in a solid-propellant

engine case .  the results indicate that the ratio of the maximum

critical compressive stress induced by the shear load to the

critical uniform compressive stress varies from 1.9 for

the curvature parameter z equal to 1.6 as z

becomes infinite .  in particular, the increase in stress is

less than 20 per sq. ft. for z greater than 100 .

the stability of thin cylinders loaded by lateral external




.T

an integral method for calculation of supersonic laminar

boundary layer with heat transfer on yawed cone .

.A

yen,shee-mang and thyson,n.a.

.B

aiaa jnl. 1, 1963, 672.

.W

an integral method for calculation of supersonic laminar

boundary layer with heat transfer on yawed cone .

  an integral method for calculating the three-dimensional

boundary layer over the surface of

a cone at angle of attack is investigated .

the numerical procedure of integration for that

method on the basis of a simplifying assumption

concerning the boundary layer development

along the cone generator is developed and illustrated

by applying the method to find the

solutions of integral equations for a specific example .

the results obtained for the example for the

range of circumferential angle of 40 investigated

are summarized and given as heat transfer

coefficients, coefficients of friction, and other

friction parameters .  the distribution of heat

transfer coefficients checked with ava


.B

j. ae. scs. 1962, 170.

.W

inviscid-incompressible-flow theory of static two-dimensional

solid jets, in proximity to the ground .

  the inviscid-incompressible-flow theory of static two-

dimensional solid jets impinging orthogonally on the ground is

presented using conformal mapping methods .

  it is shown that the thrust of a solid jet at constant power

initially decreases as the ground is approached .  the

magnitude of the thrust out of ground effect is regained only at a very

low height-to-jet width ratio (approximately 0.55) .  the

maximuin decrease is about 6 percent .  the ground effect on solid

jets is thus largely unfavorable .

.I 1224

.T

on the plk method and the supersonic blunt-body problem .

.A

vaglio-laurin,r.

.B

j. ae. scs. 1962, 185.

.W

on the plk method and the supersonic blunt-body problem .

  detailed analysis of the subsonic and transonic portious of the

flow field about either very blunt or asymmetric configurations

requires successive ap

increase the polar peak noise level by several db .  the silencing

action of the edge notches and edge teeth may also be

interpreted as due apparently to the result of possible distortion of

the shear-layer profiles .

.I 1245

.T

some aspects of nonequilibrium flows .

.A

sedney,r.

.B

j. ae. scs. 1961, 189.

.W

some aspects of nonequilibrium flows .

  in this paper are discussed some of the general features of

nonequilibrium flow .  in particular, vibrational relaxation is

discussed in detail .  this case is somewhat simpler than dissociation

and ionization but it illustrates some of the main new features of

nonequilibrium flow .  those aspects of two-dimensional and

axisymmetric flow behind shock waves are examined analytically

which yield significant information without requiring numerical

solution of the governing equations .

  the thermodynamics of a vibrational relaxing gas are

discussed .  the conditions for simulating flows are noted .  crocco's

theorem and t


memorandum l57l10 .  a limited amount of

data was obtained in freon-12 .

  results of the tests in air

and in freon-12 are in good agreement

with the flutter calculations at

all mach numbers .  the test data

compare favorably with previously

published transonic flutter data for the

same wing planform .  the results

indicate that flutter characteristics

obtained in freon-12 may be interpreted

directly as equivalent flutter

data in air at the same mass ratio and mach number .

.I 1291

.T

atmosphere entries with spacecraft lift-drag ratios

modulated to limit decelerations .

.A

levy,l.l.

.B

nasa tn.d1427, 1962.

.W

atmosphere entries with spacecraft lift-drag ratios

modulated to limit decelerations .

  an analysis has been made of

atmosphere entries for which the

spacecraft lift-drag ratios were

modulated to limit the maximum

deceleration .  the parts of the drag polars

used during modulation were from

maximum lift coefficient to minimum

drag coefficient .  fi

upon bodies of revolution at angles of

incidence, while neglecting .

general formulae are

established for the coefficients of axial

force, normal force and moments .

the formulae are developed according

to the powers of incidence, the first

terms of each formula being of very simple form .

.I 1305

.T

a proposed programme of wind tunnel tests at hypersonic

speeds to investigate the lifting properties of geometrically

slender shapes .

.A

peckham,d.h.

.B

rae tn.aero.2730.

.W

a proposed programme of wind tunnel tests at hypersonic

speeds to investigate the lifting properties of geometrically

slender shapes .

  a programme of tests at hypersonic

speeds on slender bodies is

described, which has the aim of

investigating how lift is generated, and

the compromises that may be enforced

by aerodynamic heating .  the programme

is based on models of simple geometric

shape, from which lifting

configurations will later be built up .

.I 1306

.T

experiments on circular c

the use of freon-12 as a

substitute medium for air in aerodynamic

testing have been made .  the

use of freon-12 instead of air makes

possible large savings in

wind-tunnel drive power .  because of the

fact that the ratio of specific

heats is approximately 1.13 for freon-12

as compared with 1.4 for air,

some differences exist between data

obtained in freon-12 and in air .

methods for predicting aerodynamic

characteristics of bodies in air

from data obtained in freon-12, however,

have been developed from the

concept of similarity of the streamline

pattern .  these methods,

derived from consideration of two-dimensional

flows, provide substantial

agreement in all cases for which comparative

data are available .  these

data consist of measurements throughout

a range of mach number from

approximately 0.4 to 1.2 of pressure

distributions and hinge moments on

swept and unswept wings having aspect

ratios ranging from 4.0 to 9.0,

including cases where a substantial

pa

.A

love,e.s.

.B

naca tn.3709, 1956.

.W

aerodynamic investigation of a parabolic body of revolution

at mach number of 1. 92 and some effects of an annular

supersonic jet exhausting from the base .

  an aerodynamic investigation

of a parabolic body of revolution was

conducted at a mach number of 1.92

with and without an annular supersonic

jet exhausting from the base .

measurements with the jet inoperative were

made of lift, drag, pitching moment,

radial and longitudinal pressure

distributions, and base pressures .

with the jet in operation,

measurements were made of the pressures

over the rear of the body with the

primary variables being angle of attack,

ratio of jet velocity to

freestream velocity, and ratio of jet

pressure to stream pressure .

  the results with the jet inoperative

showed that the radial

pressures over the body varied appreciably

from the distribution generally

employed in most approximate theories .

the linearized solutions for lift,

pit

which do not require preceding function values

to be known .  from a general theory given by

kutta, one such process is chosen giving

fourth-order accuracy and requiring the minimum

number of storage registers .  it is developed into

a form which gives the highest attainable

accuracy and can be carried out by comparatively

few instructions .  the errors are studied and

a simple example is given .

.I 1389

.T

numerical construction of detached shock waves .

.A

garabedian,p.r.

.B

j.math.phys. 36, 1957, 192.

.W

numerical construction of detached shock waves .

  this article proposes a new method for solving the problem

of the detached shock wave .  if the shock wave is assumed

known, a cauchy problem for a system of partial differential

equations arises .  this has been solved by several authors

in the region where the system is elliptic (near the peak of

the shock wave) .  considering the plane stationary case, the

author seeks an analytic continuation of the propa

## Split Dataset

In [5]:
with open('./cran/cran.all.1400', 'r') as f:
    content = f.readlines()
    for line in content:
      if line.startswith('.I'):
          f_ = line.split('.I')[1]
          name = './cran/splitfiles/' + f_[:-1].strip() + '.txt'
          f1 = open(name, 'w')
          f1.write(line)
          f1.close()
      else:
          with open(name, 'a') as f1:
              f1.write(line)
          f1.close()
    f.close()

In [6]:
!cat ./cran/splitfiles/1.txt

.I 1
.T
experimental investigation of the aerodynamics of a
wing in a slipstream .
.A
brenckman,m.
.B
j. ae. scs. 25, 1958, 324.
.W
experimental investigation of the aerodynamics of a
wing in a slipstream .
  an experimental study of a wing in a propeller slipstream was
made in order to determine the spanwise distribution of the lift
increase due to slipstream at different angles of attack of the wing
and at different free stream to slipstream velocity ratios .  the
results were intended in part as an evaluation basis for different
theoretical treatments of this problem .
  the comparative span loading curves, together with
supporting evidence, showed that a substantial part of the lift increment
produced by the slipstream was due to a /destalling/ or
boundary-layer-control effect .  the integrated remaining lift
increment, after subtracting this destalling lift, was found to agree
well with a potential flow theory .
  an empirical evaluation of the destalling ef

## Text Preprocessing

### Creating our own stoplist

In [7]:
with open('./cran/cran.all.1400', 'r') as f:
    content = f.read()
    word_tokens = nltk.word_tokenize(content)
    fdist_mis = nltk.FreqDist(word_tokens)
    for i in fdist_mis:
        print(i+ ":"+ str(fdist_mis[i]))


the:20192
.:15740
of:13976
,:10889
and:7097
a:6509
in:5009
to:4686
is:4116
for:3711
with:2441
are:2430
on:2326
at:2075
flow:2057
by:1794
that:1569
an:1516
.W:1402
.A:1401
.B:1401
.I:1400
.T:1400
be:1272
pressure:1243
from:1165
as:1123
this:1081
boundary:1071
number:1023
which:981
mach:904
results:897
layer:877
theory:858
it:855
method:730
was:698
(:662
shock:650
):647
supersonic:621
effects:597
surface:592
been:591
heat:590
were:583
obtained:542
solution:527
given:522
temperature:520
laminar:513
equations:508
these:500
velocity:498
or:498
wing:486
has:482
body:470
experimental:466
ratio:462
effect:460
hypersonic:459
j.:456
made:449
transfer:444
analysis:440
can:433
two:429
presented:425
conditions:425
found:422
plate:420
numbers:415
some:413
jet:411
over:411
distribution:409
have:406
reynolds:404
problem:391
between:385
range:384
solutions:381
bodies:377
buckling:363
also:362
case:362
investigation:356
air:354
than:348
shown:347
not:343
turbulent:338
wings:336
data:334
aerodynamic:324


extending:18
18:18
impeller:18
vector:18
lees:18
he:18
replaced:18
dynamics:18
what:18
electrical:18
strengths:18
need:18
mounted:18
propellers:18
divided:18
showing:18
station:18
opposite:18
depth:18
attributed:18
great:18
thicknesses:18
loadings:18
interior:18
common:18
flying:18
root:18
tends:18
extreme:18
employing:18
r.j:18
throat:18
simpler:18
perpendicular:18
round:18
dissociating:18
jones:18
quart:18
shearing:18
represents:18
annular:18
plotted:18
find:18
separate:18
corner:18
concentrated:18
systematic:18
180:18
reduces:18
torsional:18
covered:18
afterbodies:18
inclined:18
asme:18
severe:18
reentry:18
feature:17
double:17
down:17
primarily:17
extremely:17
largely:17
discusses:17
yet:17
rarefied:17
g.:17
allows:17
practice:17
furthermore:17
depending:17
interpreted:17
horizontal:17
crocco:17
quasi-steady:17
derivation:17
half-angle:17
noted:17
family:17
identical:17
hypervelocity:17
did:17
distortion:17
understanding:17
mechanical:17
/:17
products:17
isotropic:17
action:17
flam

porous-wall:6
johns:6
exchange:6
consequence:6
magnetohydrodynamics:6
bagley:6
50,000:6
flutter-speed:6
w.a:6
plastics:6
vapour:6
complementary:6
erfc:6
tabulation:6
w.p:6
york:6
rotors:6
4.0:6
margin:6
luminous:6
king-hele:6
vaporization:6
rational:6
/2/:6
stone:6
h.l:6
flexure:6
rotary:6
farnborough:6
diurnal:6
machines:6
fan:6
vibrating:6
methane-air:6
beneath:6
moved:6
sector:6
launched:6
scatter:6
superposition:6
downward:6
david:6
attitude:6
january:6
pivotal:6
indical:6
tm:6
discs:6
bounds:6
sectorial:6
carrier:6
reinforced:6
influenced:6
center-of-gravity:6
cycling:6
breathing:6
flat-panel:6
cycles:6
gusts:6
bomber:6
shanley:6
groups:6
eccentricities:6
walled:6
unpressurized:6
wavelength:6
rheological:6
fail:6
injectant:6
vessels:6
springs:6
perpendicularly:6
blade-to-blade:6
splitter:6
jet-off:6
retrorocket:6
.5:6
created:6
milliseconds:6
loose:6
wetted:6
curvilinear:6
low-subsonic:6
simply-supported:6
manchester:5
yen:5
k.t:5
in.:5
devices:5
orders:5
grows:5
impossible:5
mane

nonparallel:3
hoshizaki:3
lightweight:3
secondly:3
associate:3
tan:3
359:3
analyze:3
361:3
london:3
ser:3
362:3
h.n:3
kutta-joukowski:3
potential-flow:3
two-point:3
optimization:3
simulators:3
tips:3
giese:3
bergdolt:3
memorandum:3
met:3
10-:3
ames:3
zoom:3
stewartson-illingworth:3
theorems:3
reverse-flow:3
non-steady:3
bending-torsion:3
381:3
displayed:3
1052:3
387:3
justification:3
employs:3
irreversible:3
budiansky:3
j.g.:3
ias:3
nitrogen-atom:3
concentrations:3
y.a:3
sink:3
405:3
joseph:3
indications:3
conflict:3
410:3
fore:3
intrinsic:3
council:3
f.f:3
preferred:3
stressed:3
images:3
inflection:3
494:3
lab:3
bears:3
cal:3
variable-geometry:3
adjustable:3
m.e:3
flown:3
image:3
line-vortex:3
ultimate:3
beckwith:3
warren:3
abruptly:3
440:3
stall-flutter:3
instances:3
reduced-frequency:3
woolston:3
midspan:3
20degree:3
of-freedom:3
jastrow:3
pearse:3
449:3
yielding:3
interpretations:3
propeller-driven:3
interacts:3
collocation:3
re-examination:3
thirteen:3
formally:3
adjusted:3
tani:3

closing:2
reply:2
radically:2
558:2
statistics:2
0.500-in.-diameter:2
mm:2
transpiration-cooling:2
graham:2
771:2
loosely:2
562:2
empirically:2
penetrates:2
chiefly:2
scalar:2
565:2
3.9:2
h.t:2
kendall:2
5-dash:2
6,000:2
/4/:2
ultra-high:2
2.35:2
826:2
hertzberg:2
0.15:2
573:2
high-altitude:2
bray:2
k.n.c:2
begun:2
'freezing:2
1962.:2
complete-compressor-stall:2
abrupt-stall:2
stage-matching:2
chemistry:2
multiple-valued:2
intermediate-speed:2
e56b03b:2
xiii:2
heat-flow:2
/associated:2
lecturer:2
cranfield:2
project:2
goodman:2
583:2
fluid-flow:2
634:2
yang:2
publications:2
1273:2
591:2
637:2
pressure-ratio:2
a.r:2
h.g.:2
852:2
596:2
pivots:2
woodgate:2
scruton:2
/core/:2
lay-out:2
680:2
quick-acting:2
valve:2
insulation:2
bottom:2
seconds:2
2739:2
noticed:2
monaghan:2
/thin/:2
607:2
lundgen:2
mass-flow:2
collects:2
subscripts:2
d.m.c:2
e/0.2/:2
ellipticity:2
summarize:2
5-10:2
615:2
/0.2:2
.and:2
semi-major:2
the'solar:2
ultra-violet:2
averages:2
discoverer:2
disappears:2
k.m:2
1009:2

belong:1
/exceptional:1
hislop:1
g.s:1
proc.cam.phil.s:1
superposed:1
eighth:1
ninth:1
156:1
t.h:1
100,1922,499:1
shallower:1
quotation:1
/this:1
trochoidal:1
weitbrecht:1
roy.s:1
triple-valuedness:1
inverts:1
158:1
heisler:1
m.p:1
historics:1
trial-and-error:1
dusinberre:1
insure:1
160:1
r-15:1
riolate:1
r-1:1
162:1
r-3:1
contributing:1
retrovelocity:1
opposition:1
34.5:1
163:1
r-55:1
conveniently:1
prescribing:1
broadened:1
10-earth-g:1
hitting:1
heliocentric:1
planeto-centric:1
technologically:1
voyages:1
lift-off:1
r-11:1
introduring:1
precisely:1
sanger:1
slowing:1
165:1
r-26:1
obtainec:1
0.11:1
0.32:1
schultz-grunow:1
kempf:1
/univeral:1
constants/:1
appraoch:1
uncorrected:1
inexpensive:1
slighlty:1
166:1
r117:1
non-heat-conducting:1
amongst:1
'reservoir:1
dispersion:1
concisely:1
'oxygen-like:1
n102:1
de-exciting:1
eucken-value:1
distorting:1
/boundary:1
layers/:1
hirschfelder:1
cell:1
eucken:1
b.sc.:1
a.f.r.ae.s:1
collisions:1
tm.1418:1
verifies:1
identifying:1
171:1
sherman:1


488:1
leonard:1
linearizing:1
announces:1
co.:1
msvo:1
et.al.:1
air-flows:1
processes/:1
30th:1
annual:1
489:1
mentioning:1
quote:1
unhappy:1
justifications:1
ber:1
surely:1
letting:1
typography:1
eqs:1
confusing:1
typographical:1
heading:1
490:1
j.ae.scs.29:1
magnetic-field:1
491:1
burton:1
describable:1
concise:1
firmer:1
492:1
earl:1
keener:1
493:1
slug:1
h':1
pr:1
exceptions:1
borcher:1
e.f:1
496:1
497:1
arithmetic:1
498:1
499:1
greensite:1
traveled:1
linearity:1
constancy:1
con-damped:1
sinusoids:1
coefstant:1
exponentially-ficients:1
spoken:1
communication:1
laguerre:1
nice:1
puzzled:1
facilitates:1
freeing:1
idealizing:1
refine:1
/reverse/:1
compete:1
viewing:1
/afresh/:1
revolution/:1
/standardize/:1
ridyard:1
edelfelt:1
502:1
1962,752:1
503:1
drag-rise:1
onset-of-:1
separation-effects:1
redefinition:1
2.18:1
505:1
lyons:1
navy-sponsored:1
aeroballistics:1
506:1
brandmaier:1
h.e:1
quest:1
transportation:1
focused:1
operates:1
gravity-wave:1
/cushion/:1
digital-computer:1
havelo

802:1
neighbourhoods:1
804:1
mullens:1
afftc-tn-56-20:1
dev:1
u.s.a.f.:1
/sonic:1
boom/:1
popularly:1
widespread:1
hopes:1
booming:1
f-100:1
authority:1
directive:1
5524-f1:1
805:1
tn.d48:1
fighter-type:1
wooded:1
soundings:1
radar-tracking:1
tailwind:1
ground-reflection:1
objectionable:1
cracking:1
plate-glass:1
806:1
lina:1
tn.d235:1
d-48:1
807:1
tn.d880:1
ground-pressure:1
1.24:1
1.52:1
gross-weight:1
83,000:1
808:1
tn.d161:1
bypass:1
boom-producing:1
nonaxisymmetrical:1
far-field:1
tn.d881:1
4-:1
bow-shock:1
810:1
3-4-59l:1
subjective:1
811:1
=pressure:1
whichever:1
812:1
l58a16:1
fineness-:1
ratio-been:1
2.15:1
813:1
tn.3737:1
pulse-rocket:1
814:1
tn.3788:1
first-:1
fisher:1
l.r:1
dicamillo:1
1-21-59l:1
1.74:1
2.78:1
segmented:1
clamshell-shaped:1
served:1
dissimilar:1
phasing:1
816:1
letko:1
1-18-159l:1
3.11:1
817:1
hankelman:1
prager:1
meant:1
occasion:1
818:1
hodge:1
prandtl-reuss:1
hencky:1
magasarian:1
o.l:1
acceptability:1
820:1
rests:1
hampered:1
nonexistence:1
legitimate:1

1134:1
edition:1
proof-testing:1
diameter-thickness:1
psi:1
undergone:1
safe:1
interim:1
exercise:1
prism:1
rigidly:1
unflanged:1
1136:1
np:1
0.06:1
0.08:1
0.14:1
0.16:1
toriconic1a:1
complement:1
1137:1
nonhomogeneous:1
langer:1
trans.amer.math.soc.33:1
wissler:1
dissertation:1
observes:1
jointly:1
occuring:1
beskin:1
1139:1
randomly:1
diffusely:1
gage:1
satelite:1
1140:1
chaudhury:1
apropriate:1
hays:1
1141:1
zeisberg:1
time-wise:1
1142:1
tirumalesa:1
satyanarayana:1
condensed:1
1143:1
b.e:1
tn.d1428:1
combustion-heated:1
3.25:1
1144:1
newsom:1
louis:1
d-1382:1
multiple-propeller:1
stronger:1
deeper:1
intensification:1
intensified:1
struck:1
recirculating:1
1145:1
don:1
brush:1
bo:1
1146:1
melvin:1
1147:1
jackson:1
stalder:1
jukoff:1
plate.:1
daytime:1
predominating:1
finely:1
insuring:1
boattailing:1
skin-cooling:1
lost:1
1148:1
demarcus:1
hopper:1
carbide:1
chemicals:1
k-25:1
tennessee:1
reinvestigated:1
1149:1
dragutin:1
stojanovic:1
beograd:1
jugoslavia:1
numberical:1
temprature:

### Stemming and Stopwords Removal

In [8]:
def text_process(content):
    stop_char_list = ['.', '.', '.A', '.B', '.I', '.T', '(', ')', 'j']
    stop_list_30 = [
        'the',
        'of',
        'and',
        'a',
        'in',
        'to',
        'is',
        'for',
        'with',
        'are',
        'on',
        'at',
        'by',
        'that',
        'an',
        'be',
        'from',
        'as',
        'this',
        'which',
        'it',
        'was',
        'been',
        'were',
        'these',
        'or',
        'has',
        'made',
        'can',
        'two'
    ]
    stop_list = stop_char_list+stop_list_30
    stem = PorterStemmer()

    content = content.lstrip()
    content = re.sub('[^A-Za-z]+', ' ', content)
    tokens = nltk.word_tokenize(content)
  
    stem_tokens = [stem.stem(word) for word in tokens if word not in stop_list]
    clean_stem_tokens = ' '.join(map(str,  stem_tokens))

    shortword = re.compile(r'\W*\b\w{1,2}\b')
    clean_stem_tokens = shortword.sub('', clean_stem_tokens)

    
    return clean_stem_tokens


In [9]:
# save processed doc
dir = os.listdir('./cran/splitfiles/')
for d in dir:
    with open('./cran/splitfiles/' + d, 'r') as f:
        content = f.readlines()
        flag = False
        content_ = ''
        for line in content:
            if line.startswith('.W'):
                flag = True
            if flag == True:
                if line.startswith('.W'):
                    continue
                else:
                    content_ = content_ + line
        new_content = text_process(content_)
        with open('./cran/processed_doc/' + d, 'w') as f1:
            f1.write(new_content)
            f1.close()
    f.close()

In [10]:
!tree

[01;34m.[0m
├── [00mPipeline.ipynb[0m
├── [00mREADME.md[0m
├── [01;34mcran[0m
│   ├── [00mcran.all.1400[0m
│   ├── [00mcran.qry[0m
│   ├── [00mcranqrel[0m
│   ├── [00mcranqrel.readme[0m
│   ├── [01;34mprocessed_doc[0m
│   │   ├── [00m1.txt[0m
│   │   ├── [00m10.txt[0m
│   │   ├── [00m100.txt[0m
│   │   ├── [00m1000.txt[0m
│   │   ├── [00m1001.txt[0m
│   │   ├── [00m1002.txt[0m
│   │   ├── [00m1003.txt[0m
│   │   ├── [00m1004.txt[0m
│   │   ├── [00m1005.txt[0m
│   │   ├── [00m1006.txt[0m
│   │   ├── [00m1007.txt[0m
│   │   ├── [00m1008.txt[0m
│   │   ├── [00m1009.txt[0m
│   │   ├── [00m101.txt[0m
│   │   ├── [00m1010.txt[0m
│   │   ├── [00m1011.txt[0m
│   │   ├── [00m1012.txt[0m
│   │   ├── [00m1013.txt[0m
│   │   ├── [00m1014.txt[0m
│   │   ├── [00m1015.txt[0m
│   │   ├── [00m1016.txt[0m
│   │   ├── [00m1017.txt[0m
│   │   ├── [00m1018.txt[0m
│   │   ├── [00m1019.txt[0m
│   │   ├── [00m1

In [11]:
!cat ./cran/processed_doc/1.txt

experiment investig aerodynam wing slipstream experiment studi wing propel slipstream order determin spanwis distribut lift increas due slipstream differ angl attack wing differ free stream slipstream veloc ratio result intend part evalu basi differ theoret treatment problem compar span load curv togeth support evid show substanti part lift increment produc slipstream due destal boundari layer control effect integr remain lift increment after subtract destal lift found agre well potenti flow theori empir evalu destal effect specif configur experi

## Case Folding

In [12]:
dir = os.listdir('./cran/processed_doc/')
for d in dir:
    with open('./cran/processed_doc/' + d, 'r') as f:
        content = f.readlines()
        content_ = ''
        for line in content:
            content_ = content_ + line.lower()
        with open('./cran/processed_doc/' + d, 'w') as f1:
            f1.write(content_)
            f1.close()
    f.close()

In [13]:
!cat ./cran/processed_doc/1.txt

experiment investig aerodynam wing slipstream experiment studi wing propel slipstream order determin spanwis distribut lift increas due slipstream differ angl attack wing differ free stream slipstream veloc ratio result intend part evalu basi differ theoret treatment problem compar span load curv togeth support evid show substanti part lift increment produc slipstream due destal boundari layer control effect integr remain lift increment after subtract destal lift found agre well potenti flow theori empir evalu destal effect specif configur experi

## Terms to be included in the dictionary

In [14]:
dir = os.listdir('./cran/processed_doc/')
vocab = []
for d in dir:
    with open('./cran/processed_doc/1.txt', 'r') as f:
        content = f.read()
        word_tokens = nltk.word_tokenize(content)
        fdist_mis = nltk.FreqDist(word_tokens)
    for i in fdist_mis:
        vocab.append(i)

In [15]:
vocab

['slipstream',
 'lift',
 'wing',
 'differ',
 'destal',
 'experiment',
 'due',
 'part',
 'evalu',
 'increment',
 'effect',
 'investig',
 'aerodynam',
 'studi',
 'propel',
 'order',
 'determin',
 'spanwis',
 'distribut',
 'increas',
 'angl',
 'attack',
 'free',
 'stream',
 'veloc',
 'ratio',
 'result',
 'intend',
 'basi',
 'theoret',
 'treatment',
 'problem',
 'compar',
 'span',
 'load',
 'curv',
 'togeth',
 'support',
 'evid',
 'show',
 'substanti',
 'produc',
 'boundari',
 'layer',
 'control',
 'integr',
 'remain',
 'after',
 'subtract',
 'found',
 'agre',
 'well',
 'potenti',
 'flow',
 'theori',
 'empir',
 'specif',
 'configur',
 'experi',
 'slipstream',
 'lift',
 'wing',
 'differ',
 'destal',
 'experiment',
 'due',
 'part',
 'evalu',
 'increment',
 'effect',
 'investig',
 'aerodynam',
 'studi',
 'propel',
 'order',
 'determin',
 'spanwis',
 'distribut',
 'increas',
 'angl',
 'attack',
 'free',
 'stream',
 'veloc',
 'ratio',
 'result',
 'intend',
 'basi',
 'theoret',
 'treatment',
 'p

## Generate (term, docID) pairs

In [16]:
def create_vector_index(path):
    inverted_index = {} 
    dir = os.listdir(path)
    no_docs = len(dir)
    for d in dir:
        id = d.split('.')[0]
        f = open(path + d, 'r')
        content = f.read()
        f.close()
        tokens = nltk.word_tokenize(content)
        for token in tokens:
            if token not in inverted_index:
                inverted_index[token] = {}
            if id not in inverted_index[token]:
                inverted_index[token][id]= 1
            else:
                inverted_index[token][id] += 1
    return inverted_index

In [17]:
index = create_vector_index('./cran/processed_doc/')
print(index)
f = open('./inverted_index.txt', 'w')
f.write(json.dumps(index))
f.close()

{'spheric': {'1053': 2, '262': 1, '101': 1, '263': 1, '1291': 1, '739': 2, '1327': 2, '688': 2, '1136': 1, '449': 1, '448': 1, '110': 2, '1080': 1, '162': 1, '149': 1, '613': 2, '1344': 1, '954': 1, '1234': 1, '830': 2, '401': 1, '831': 1, '827': 4, '614': 2, '370': 1, '1140': 1, '826': 1, '761': 1, '80': 1, '341': 1, '828': 3, '626': 1, '1205': 1, '44': 1, '423': 5, '1001': 1, '294': 2}, 'cap': {'1053': 3, '101': 1, '1136': 1, '831': 1, '95': 3, '553': 1, '626': 1, '294': 1}, 'snap': {'1053': 1, '957': 1}, 'nonlinear': {'1053': 1, '1132': 1, '14': 1, '467': 2, '1295': 1, '1056': 2, '1255': 1, '299': 1, '1030': 1, '201': 3, '62': 1, '1220': 2, '824': 2, '830': 2, '367': 1, '1181': 1, '164': 2, '831': 2, '833': 1, '827': 2, '585': 3, '1366': 1, '584': 1, '1012': 1, '579': 1, '418': 1, '395': 2, '801': 1, '593': 2, '1361': 2, '743': 3, '1000': 2, '1259': 1, '1073': 1, '1059': 2, '863': 1, '1272': 1, '1060': 1, '132': 1, '130': 4, '468': 1, '24': 1, '284': 1}, 'boundari': {'1053': 2, '104

# THE END #