Skip to content

inter-action/scikit-learn

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

About this Repo

考虑到好久没有使用过Python了, 最近一直对机器学习很有兴趣, 所以python这一关是我是要重新开始了。 这个Repo会作为我当前python状态的锚点和跳板。

这个项目是如何开始的: 我的另一个项目document-indexer, 我想加下计算文档相似度&cluster的功能。所以搜索了些链接。就是这个 Machine Learning :: Cosine Similarity for Vector Space Models (Part III) 但这个博文是3部曲的,(BTW, this guy's blog has very high quality!),诺,看了第一篇, 就手痒想搞下。这个项目就诞生了。

这个项目的未来方向: 这个项目未来会存放任何和sklearn相关机器学习的代码。会当做以后的练习和参考。

Project Settings Up

Preliminaries:

install virtualenv, pip, ipython !

with Makefile:

make run-main       # run this project.

with vscode:

https://code.visualstudio.com/docs/languages/python

with ipython

先activate virtualenv, then >ipython

ctrl + l  #clear screen

# --- auto reload module:
# http://stackoverflow.com/questions/5364050/reloading-submodules-in-ipython

%load_ext autoreload
%autoreload 2

# add auto reload permanently
ipython profile create

# appending following content to `~/.ipython/profile_default/ipython_config.py`
c.InteractiveShellApp.extensions = ['autoreload']     
c.InteractiveShellApp.exec_lines = ['%autoreload 2']

Python3 language:

dir()               # print out what available in this module
import builtins     # see what inside builtins

Exceptions - https://docs.python.org/3/tutorial/errors.html

import:

from driver import chrome_driver as mydriver # absolute import

from ..driver import chrome_driver as mydriver # relative import

    - driver
        - chrome_driver
    - crawler
        - web_crawler.py # 这个file引用chrome_driver
    main.py # 这种情况下relative import是非法的.
        http://stackoverflow.com/questions/33837717/systemerror-parent-module-not-loaded-cannot-perform-relative-import
            Note that relative imports are based on the name of the current module. 
            Since the name of the main module is always "__main__", modules intended for use as 
            the main module of a Python application must always use absolute imports.

string format:

https://pyformat.info/

is vs ==

http://stackoverflow.com/questions/2988017/string-comparison-in-python-is-vs

You use == when comparing values and is when comparing identities.

Links

todos:

done: 
    test using ipython with virtualenv activated

pending:
    add a linter
    add flask, add restful webservices, 如果是练习的目的话,这个应该不需要了,再看
    add docker, then config Makefile again.
    ! learning more about ipython
    string format - http://www.python-course.eu/python3_formatted_output.php
    find a better way to handle __pycache__, generate it in non-src folder
    add python unit test

About

exercise ML using scikit-learn

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published