This file documents Biopython modules that have been moved or deprecated in
favor of other modules. It is meant as a quick, easy-to-find guide to
updating your code so that it works again.

Numeric support
===============
Following the release of 1.48, Numeric support in Biopython is discontinued.
Please move to NumPy.

Bio.Seq
=======
Direct use of the Seq object (and MutableSeq object) .data property is discouraged.
As of release 1.49, writing to the Seq object's .data property triggers a warning,
and this property is likely to be made read only in the next release.
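
As an illustration only, the sketch below shows how a property can warn on
write while still honouring the assignment. SketchSeq is a hypothetical
stand-in class, not Biopython's Seq, and the warning text and category are
assumptions.

```python
import warnings


class SketchSeq:
    """Hypothetical stand-in for a sequence class; not Biopython's Seq."""

    def __init__(self, data):
        self._data = data

    def __str__(self):
        # Converting to a string avoids touching .data at all.
        return self._data

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, value):
        # Writing is discouraged, so warn before honouring the assignment.
        warnings.warn("Writing to .data is discouraged",
                      DeprecationWarning, stacklevel=2)
        self._data = value


seq = SketchSeq("ACGT")
text = str(seq)  # read the sequence without using .data
```

Code that only reads the sequence can usually switch to str(), which keeps
working whether or not the property later becomes read only.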

Bio.Transcribe and Bio.Translate
================================
Declared obsolete in Release 1.49.
Please use the methods or functions in Bio.Seq instead.

Martel
======
Declared obsolete in Release 1.48, deprecated in Release 1.49.

Bio.Mindy
=========
Declared obsolete in Release 1.48, deprecated in Release 1.49.

Bio.builders, Bio.Std, Bio.StdHandler, Bio.Decode and Bio.DBXRef
================================================================
Part of the Martel/Mindy parsing infrastructure, these were deprecated in
Release 1.49.

Bio.Writer and Bio.writers
==========================
Deprecated in Release 1.48.

Bio.Emboss.Primer
=================
Deprecated in Release 1.48, this parser was replaced by Bio.Emboss.Primer3 and
Bio.Emboss.PrimerSearch.

Bio.MetaTool
============
Deprecated in Release 1.48, this was a parser for the output of MetaTool 3.5
which is now obsolete.

Bio.GenBank
===========
The online functionality (search_for, download_many, and NCBIDictionary) was
declared obsolete in Release 1.48, with the intention of an official deprecation
in the following release. Please use Bio.Entrez instead.

Bio.PubMed
==========
Declared obsolete in Release 1.48, deprecated in Release 1.49.
Please use Bio.Entrez instead.

Bio.EUtils
==========
Deprecated in favor of Bio.Entrez in Release 1.48.

Bio.Blast.NCBIWWW
=================
The HTML BLAST parser was deprecated as of Release 1.48.
The deprecated functions blast and blasturl were removed in Release 1.44.

Bio.Saf
=======
Deprecated as of Release 1.48, as it appears to have no users, and relies
on Martel which doesn't work properly with mxTextTools 3.0.

Bio.NBRF
========
Deprecated as of Release 1.48 in favor of the "pir" format in Bio.SeqIO.

Bio.IntelliGenetics
===================
Deprecated as of Release 1.48 in favor of the "ig" format in Bio.SeqIO.

Bio.SeqIO submodules PhylipIO, ClustalIO, NexusIO and StockholmIO
=================================================================
You can still use the "phylip", "clustal", "nexus" and "stockholm" formats
in Bio.SeqIO; however, these are now supported via Bio.AlignIO, with the
old code deprecated in Releases 1.46 or 1.47, and removed in Release 1.49.

Bio.ECell
=========
Deprecated as of Release 1.47, as it appears to have no users, and the code
does not seem relevant for ECell 3. Removed in Release 1.49.

Bio.LocusLink
=============
Deprecated as of Release 1.45, removed in Release 1.49.
The NCBI's LocusLink was superseded by Entrez Gene.

Bio.SGMLExtractor
=================
Deprecated as of Release 1.46, removed in Release 1.49.

Bio.Rebase
==========
Deprecated as of Release 1.46, removed in Release 1.49.

Bio.Gobase
==========
Deprecated as of Release 1.46, removed in Release 1.49.

Bio.CDD
=======
Deprecated as of Release 1.46, removed in Release 1.49.

Bio.biblio
==========
Deprecated as of Release 1.45, removed in Release 1.48.

Bio.WWW
=======
The modules under Bio.WWW were deprecated in Release 1.45, and removed in 1.48.
The remaining stub Bio.WWW was deprecated in Release 1.48.

The functionality in Bio.WWW.SCOP, Bio.WWW.InterPro and Bio.WWW.ExPASy
is now available from Bio.SCOP, Bio.InterPro and Bio.ExPASy instead.

Bio.SeqIO
=========
The old Bio.SeqIO.FASTA and Bio.SeqIO.generic were deprecated in favor of
the new Bio.SeqIO module as of Release 1.44, and removed in Release 1.47.

Bio.Medline.NLMMedlineXML
=========================
Deprecated in Release 1.44, removed in 1.46.

Bio.MultiProc
=============
Deprecated in Release 1.44, removed in 1.46.

Bio.MarkupEditor
================
Deprecated in Release 1.44, removed in 1.46.

Bio.lcc
=======
Deprecated in favor of Bio.SeqUtils.lcc in Release 1.44, removed in 1.46.

Bio.crc
=======
Deprecated in favor of Bio.SeqUtils.CheckSum in Release 1.44, removed in 1.46.

Bio.FormatIO
============
This was removed in Release 1.44 (a deprecation was not possible).

Bio.expressions (and therefore Bio.config, Bio.dbdefs and Bio.formatdefs)
=========================================================================
These were deprecated in Release 1.44, and removed in Release 1.49.

Bio.Kabat
=========
This was deprecated in Release 1.43 and removed in Release 1.44.

Bio.SeqUtils
============
The functions 'complement' and 'antiparallel' in Bio.SeqUtils have been
deprecated as of Release 1.31, and removed in Release 1.43.
Use the functions 'complement' and 'reverse_complement' in Bio.Seq instead.

Bio.GFF
=======
The functions 'forward_complement' and 'antiparallel' in Bio.GFF.easy have been
deprecated as of Release 1.31, and removed in Release 1.43.
Use the functions 'complement' and 'reverse_complement' in Bio.Seq instead.

Bio.sequtils
============
Deprecated as of Release 1.30, removed in Release 1.42.
Use Bio.SeqUtils instead.

Bio.SVM
=======
Deprecated as of Release 1.30, removed in Release 1.42.
The Support Vector Machine code in Biopython has been superseded by a
more robust (and maintained) SVM library, which includes a Python
interface. We recommend using LIBSVM:

http://www.csie.ntu.edu.tw/~cjlin/libsvm/

Bio.RecordFile
==============
Deprecated as of Release 1.30, removed in Release 1.42.
RecordFile wasn't completely implemented and duplicates the work
of most standard parsers.

Bio.kMeans and Bio.xkMeans
==========================
Deprecated as of Release 1.30, removed in Release 1.42.

The k-means algorithm performs unsupervised clustering of data.
Biopython includes an implementation of the k-means clustering algorithm
in kMeans.py. Recently, a larger set of clustering algorithms entered
Biopython as Bio.Cluster. As the kcluster routine in Bio.Cluster also implements
the k-means clustering algorithm, the kMeans.py module has been deprecated.
Below you will find a description of how to switch from kMeans.py to
Bio.Cluster's kcluster.

The function kcluster in Bio.Cluster performs k-means or k-medians clustering.
The corresponding function in kMeans.py is called cluster. This function takes
the following arguments:

o data
o k
o distance_fn
o init_centroids_fn
o calc_centroid_fn
o max_iterations
o update_fn

The function kcluster in Bio.Cluster takes the following arguments:

o data
o nclusters
o mask
o weight
o transpose
o npass
o method
o dist
o initialid


Arguments for kMeans.py's cluster, and their equivalents in Bio.Cluster
-----------------------------------------------------------------------


o data:

In kMeans.py, data is a list of vectors, each containing the same number of
data points. Within the context of clustering genes based on their gene
expression values, each vector would correspond to the gene expression data of
one particular gene, and the values in the vector would correspond to the
gene expression values measured by the different microarrays. The cluster
routine in kMeans.py always performs a row-wise clustering by grouping vectors.

The argument data to Bio.Cluster's kcluster has the same structure as in
kMeans.py. However, Bio.Cluster allows row-wise and column-wise clustering by
the transpose argument. If transpose==0 (the default value), kcluster performs
row-wise clustering, consistent with kMeans.py. If transpose==1, kcluster
performs column-wise clustering. The same behavior can be obtained, of course,
by transposing the data array before calling kcluster.
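
For data held as a plain list of vectors, transposing by hand might look
like the sketch below. The helper name transpose and the sample values are
illustrative assumptions; nothing here calls Bio.Cluster.

```python
def transpose(data):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(column) for column in zip(*data)]


# Three genes (rows) measured on two microarrays (columns).
data = [[1.0, 2.0],
        [3.0, 4.0],
        [5.0, 6.0]]

# After transposing, each row holds one microarray, so a row-wise
# clustering of by_array groups microarrays instead of genes.
by_array = transpose(data)
```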


o k:

The desired number of clusters is specified by the input argument k in
kMeans.py. The corresponding argument in Bio.Cluster's kcluster is nclusters.

o distance_fn:

In kMeans.py, the argument distance_fn represents the distance function to
calculate the distances between items and cluster centroids. This argument
corresponds to a true Python function. The default value is the Euclidean
distance, implemented as distance.euclidean in distance.py. User-defined
distance functions can also be used.

The k-means routine in Bio.Cluster does not allow user-specified distance
functions. Instead, it provides the following nine built-in distance functions,
depending on the argument dist:

dist=='e': Euclidean distance
dist=='h': Harmonically summed Euclidean distance
dist=='b': City-block distance
dist=='c': Pearson correlation
dist=='a': absolute value of the Pearson correlation
dist=='u': uncentered correlation
dist=='x': absolute uncentered correlation
dist=='s': Spearman's rank correlation
dist=='k': Kendall's tau

User-defined distance functions are possible only by modifying the C code in
cluster.c (which may not be as hard as it sounds). The default distance function
is the Euclidean distance (dist=='e'). Note that in Bio.Cluster the
Euclidean distance is defined as the sum of squared differences, whereas in
kMeans.py the square root of this quantity is taken. This does not affect the
clustering result, because taking the square root does not change which
centroid is closest to an item.
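
The remark about the square root can be checked with a small pure-Python
sketch. The function names and sample points are assumptions made for
illustration, not code from kMeans.py or cluster.c.

```python
import math


def squared_euclidean(a, b):
    """Sum of squared differences (no square root)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the sum of squared differences."""
    return math.sqrt(squared_euclidean(a, b))


def nearest(item, centroids, distance):
    """Index of the centroid closest to item under the given distance."""
    return min(range(len(centroids)),
               key=lambda i: distance(item, centroids[i]))


centroids = [[0.0, 0.0], [4.0, 4.0]]
items = [[1.0, 0.5], [3.0, 3.5], [2.5, 1.0]]

by_squared = [nearest(v, centroids, squared_euclidean) for v in items]
by_root = [nearest(v, centroids, euclidean) for v in items]
# Both lists are [0, 1, 0]: the square root is monotonic, so the
# ranking of centroids by distance is unchanged.
```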

o init_centroids_fn:

This function specifies the initial choice for the cluster centroids. By
default, cluster in kMeans.py uses a random initial choice of cluster centroids
by randomly choosing k data vectors from the input vectors in the data input
argument. Alternatively, the user can specify a user-defined function to choose
the initial cluster centroids.

In Bio.Cluster, the k-means algorithm in kcluster starts from an initial cluster
assignment instead of an initial choice of cluster centroids. As far as I know,
these two initialization methods are equivalent in practice. Similar to the
cluster routine in kMeans.py, Bio.Cluster's kcluster performs a random initial
assignment of items to clusters. Alternatively, users can specify a
(deterministic) initial clustering via the initialid argument. This argument is
None by default. If not None, it should be a 1D array (or list) containing the
number (between 0 and nclusters-1) of the cluster to which each item is
assigned initially.

Note that the k-means routine in Bio.Cluster performs automatic repeats of the
algorithm, each time starting from a different random initial clustering. See
the comment for the npass argument below.
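
One way to relate the two initialization styles: an initial assignment
implies a set of centroids, namely the mean of the items placed in each
cluster. The sketch below computes them in plain Python. The helper
centroids_from_assignment is hypothetical, and it assumes that every
cluster receives at least one item.

```python
def centroids_from_assignment(data, initialid, nclusters):
    """Mean vector of the items assigned to each cluster."""
    ncolumns = len(data[0])
    sums = [[0.0] * ncolumns for _ in range(nclusters)]
    counts = [0] * nclusters
    for vector, cluster in zip(data, initialid):
        counts[cluster] += 1
        for j, value in enumerate(vector):
            sums[cluster][j] += value
    # Assumption: no cluster is empty, so counts[c] is never zero.
    return [[total / counts[c] for total in sums[c]]
            for c in range(nclusters)]


data = [[1.0, 1.0], [3.0, 1.0], [10.0, 8.0], [12.0, 10.0]]
initialid = [0, 0, 1, 1]  # cluster number for each item, as in kcluster
centroids = centroids_from_assignment(data, initialid, 2)
```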

o calc_centroid_fn:

This argument specifies how to calculate the cluster centroids, given the data
vectors of the items that belong to each cluster. By default, the mean over the
vectors is calculated. A user-defined function can also be used.

Bio.Cluster's kcluster does not allow user-defined functions. Instead, the
method to calculate the cluster centroid is determined by the argument method,
which can be either 'a' (arithmetic mean) or 'm' (median). The default is to
calculate the mean ('a').
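
The difference between the two method values can be sketched with the
standard library. The function centroid below is an illustration of the
idea only; it is not how Bio.Cluster computes centroids internally.

```python
from statistics import mean, median


def centroid(vectors, method="a"):
    """Column-wise centroid: 'a' for arithmetic mean, 'm' for median."""
    reducers = {"a": mean, "m": median}
    reduce_column = reducers[method]
    return [reduce_column(column) for column in zip(*vectors)]


cluster = [[1.0, 10.0], [2.0, 20.0], [9.0, 30.0]]
mean_centroid = centroid(cluster, "a")    # the outlier 9.0 pulls the mean up
median_centroid = centroid(cluster, "m")  # the median is less affected
```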

o max_iterations:

The cluster routine in kMeans.py has an argument max_iterations, which is used
to stop the iteration if the routine does not converge after the given number of
iterations.

The kcluster routine in Bio.Cluster does not have such an argument. The failure
of a k-means algorithm to converge is due to the occurrence of periodic
clustering solutions during the course of the k-means algorithm. The kcluster
routine in Bio.Cluster automatically checks for the occurrence of such a
periodicity in the solutions. If a periodic behavior is detected, the algorithm
is interrupted and the last clustering solution is returned. Accordingly, the
kcluster routine is guaranteed to return a clustering solution. Also see the
discussion of the npass argument below.
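
The idea of stopping on a repeated solution can be sketched as follows.
This is a simplified pure-Python illustration, not the check implemented in
cluster.c: it remembers every assignment seen so far and stops as soon as
one recurs, which covers both convergence and longer cycles. All names are
hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_centroids(data, clusterid, nclusters):
    """Mean of each cluster; an empty cluster keeps a zero vector."""
    ncolumns = len(data[0])
    centroids = []
    for c in range(nclusters):
        members = [v for v, cid in zip(data, clusterid) if cid == c]
        if members:
            centroids.append([sum(col) / len(members)
                              for col in zip(*members)])
        else:
            centroids.append([0.0] * ncolumns)
    return centroids


def kmeans_until_repeat(data, initialid, nclusters):
    """Iterate until an assignment repeats; return (clusterid, iterations)."""
    clusterid = list(initialid)
    seen = {tuple(clusterid)}
    iterations = 0
    while True:
        centroids = mean_centroids(data, clusterid, nclusters)
        clusterid = [
            min(range(nclusters),
                key=lambda c: squared_distance(v, centroids[c]))
            for v in data
        ]
        iterations += 1
        state = tuple(clusterid)
        if state in seen:
            # A fixed point or a longer cycle: stop either way.
            return clusterid, iterations
        seen.add(state)


data = [[1.0], [2.0], [9.0], [10.0]]
clusterid, iterations = kmeans_until_repeat(data, [0, 1, 0, 1], 2)
```

Because the number of possible assignments is finite, the loop always
terminates, so no max_iterations cap is needed in this sketch.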

o update_fn:

The argument update_fn to cluster in kMeans.py is a hook function that is
called at the beginning of every iteration and passed the iteration number,
cluster centroids, and current cluster assignments. It is used by xkMeans.py,
which provides a visualization of k-means clustering. Currently there is no
equivalent in Bio.Cluster.
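
The correspondences described in this section can be summarized as a
lookup table. The dictionary and helper below are only a convenience sketch
for auditing old calls. They map names, not values: for example,
distance_fn took a Python function, whereas dist takes a one-letter code.

```python
# Old kMeans.py argument -> Bio.Cluster kcluster argument.
# None means there is no equivalent in Bio.Cluster.
ARGUMENT_MAP = {
    "data": "data",
    "k": "nclusters",
    "distance_fn": "dist",
    "init_centroids_fn": "initialid",
    "calc_centroid_fn": "method",
    "max_iterations": None,
    "update_fn": None,
}


def unsupported(old_argument_names):
    """Return the old argument names that have no kcluster equivalent."""
    return [name for name in old_argument_names
            if ARGUMENT_MAP[name] is None]


# Arguments used by some hypothetical old call to kMeans.cluster.
to_review = unsupported(["data", "k", "max_iterations", "update_fn"])
```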


Other arguments for Bio.Cluster's kcluster.
-------------------------------------------

Three arguments in Bio.Cluster's kcluster do not have a direct equivalent in
kMeans.py's cluster.

o mask:

Microarray experiments tend to suffer from a large amount of missing data. The
argument mask to Bio.Cluster's kcluster lets the user specify which data are
missing. This argument is an array with the same shape as data, and contains
a 1 for each data point that is present, and a 0 for a missing data point:

mask[i,j]==1: data[i,j] is valid
mask[i,j]==0: data[i,j] is a missing data point

Missing data points are ignored by the clustering algorithm. By default, mask
is an array containing 1's everywhere.
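
What ignoring a missing data point means can be illustrated with a masked
column mean. This is a pure-Python sketch under the mask convention given
above. The helper name and the placeholder value -999.0 are assumptions,
and the details of how Bio.Cluster handles missing values internally are
not shown.

```python
def masked_column_means(data, mask):
    """Column means over the entries whose mask value is 1."""
    means = []
    for column, flags in zip(zip(*data), zip(*mask)):
        present = [value for value, flag in zip(column, flags) if flag == 1]
        # None marks a column with no valid entries at all.
        means.append(sum(present) / len(present) if present else None)
    return means


# The -999.0 placeholder is masked out, so it never enters the mean.
data = [[1.0, 5.0], [3.0, -999.0], [5.0, 7.0]]
mask = [[1, 1], [1, 0], [1, 1]]
means = masked_column_means(data, mask)
```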

o weight:

The weight argument is used to put different weights on different data points.
For example, when clustering genes based on their gene expression profile, we
may want to attach a bigger weight to some microarrays compared to others. By
default, the weight argument contains equal weights of 1.0 for all data points.
Note that for row-wise clustering, the weight argument is a 1D vector whose
length is equal to the number of columns. For column-wise clustering, the length
of this argument is equal to the number of rows.
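
A weighted distance for row-wise clustering might be sketched as below,
with one weight per column. The formula is an assumption chosen for
illustration; any normalization that Bio.Cluster applies to the weights is
not shown.

```python
def weighted_squared_distance(a, b, weight):
    """Sum of weight[j] * (a[j] - b[j])**2 over the columns."""
    if not (len(a) == len(b) == len(weight)):
        raise ValueError("vectors and weight must have the same length")
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weight))


# Two genes measured on three microarrays; the third array is trusted most.
gene_a = [1.0, 2.0, 3.0]
gene_b = [2.0, 4.0, 3.0]
weight = [1.0, 0.5, 2.0]
distance = weighted_squared_distance(gene_a, gene_b, weight)
```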

o npass:

Typical implementations of the k-means clustering algorithm rely on a random
initialization. Unlike Self-Organizing Maps, however, the k-means algorithm has
a clearly defined goal, which is to minimize the within-cluster sum of
distances. Different k-means clustering solutions (based on different initial
clusterings) can therefore be compared to each other directly. In order to
increase the chance of finding the optimal k-means clustering solution, the
k-means routine in Bio.Cluster automatically repeats the algorithm npass times,
each time starting from a different initial random clustering. The best
clustering solution is returned to the user, together with the number of
npass attempts in which it was found. For more information, see the output
variable nfound below.
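
The repeat-and-compare strategy can be sketched in pure Python as follows.
This is not Bio.Cluster's implementation: the names kmeans_once, canonical
and kcluster_sketch, the seed parameter, and the handling of empty clusters
are all assumptions made to keep the example self-contained.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def canonical(clusterid):
    """Relabel clusters by first appearance so equal partitions compare equal."""
    labels = {}
    return [labels.setdefault(c, len(labels)) for c in clusterid]


def kmeans_once(data, nclusters, rng):
    """One run from a random initial assignment; returns (clusterid, error)."""
    clusterid = [i % nclusters for i in range(len(data))]
    rng.shuffle(clusterid)  # random start that leaves no cluster empty
    seen = set()
    while tuple(clusterid) not in seen:
        seen.add(tuple(clusterid))
        centroids = []
        for c in range(nclusters):
            members = [v for v, cid in zip(data, clusterid) if cid == c]
            if not members:
                members = [rng.choice(data)]  # re-seed an emptied cluster
            centroids.append([sum(col) / len(members)
                              for col in zip(*members)])
        clusterid = [
            min(range(nclusters),
                key=lambda c: squared_distance(v, centroids[c]))
            for v in data
        ]
    error = sum(squared_distance(v, centroids[c])
                for v, c in zip(data, clusterid))
    return clusterid, error


def kcluster_sketch(data, nclusters, npass, seed=0):
    """Keep the lowest-error solution and count how often it was found."""
    rng = random.Random(seed)
    best, best_error, nfound = None, None, 0
    for _ in range(npass):
        clusterid, error = kmeans_once(data, nclusters, rng)
        clusterid = canonical(clusterid)
        if best is None or error < best_error - 1e-12:
            best, best_error, nfound = clusterid, error, 1
        elif clusterid == best:
            nfound += 1
    return best, best_error, nfound


data = [[0.0], [0.1], [10.0], [10.1]]
best, best_error, nfound = kcluster_sketch(data, nclusters=2, npass=5)
```

Relabeling with canonical matters because two runs can find the same
partition under different cluster numbers.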


Return values
-------------

The cluster routine in kMeans.py returns two values:

o centroids
o clusters

The kcluster routine in Bio.Cluster returns four values:

o clusterid
o centroids
o error
o nfound


o centroids:

The centroids return value contains the centroids of the k clusters that were
found, and corresponds to the centroids return value from Bio.Cluster's
kcluster routine.

o clusters:

The clusters return value contains the number of the cluster to which each
vector was assigned. The corresponding return value in Bio.Cluster's kcluster
is clusterid.
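
Since clusterid holds one cluster number per item, grouping items by
cluster takes only a few lines. The helper below is a hypothetical
convenience, shown with made-up cluster numbers rather than real kcluster
output.

```python
from collections import defaultdict


def members_by_cluster(clusterid):
    """Map each cluster number to the indices of the items assigned to it."""
    groups = defaultdict(list)
    for index, cluster in enumerate(clusterid):
        groups[cluster].append(index)
    return dict(groups)


clusterid = [0, 1, 0, 2, 1]  # illustrative values only
groups = members_by_cluster(clusterid)
```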

o error:

The error return value from Bio.Cluster's kcluster is the within-cluster sum of
distances for the optimal clustering solution that was found. This value can be
used to compare different clustering solutions to each other.

o nfound:

The nfound return value from Bio.Cluster's kcluster shows in how many of the
npass runs the optimal clustering solution was found. Accordingly, nfound is at
least 1 and at most equal to npass. A large value for nfound is an indication
that the clustering solution that was found is optimal. On the other hand, if
nfound is equal to 1, it is quite possible that a better clustering solution
exists than the one found by kcluster.