From 4172604580874f72b664bfd29fddb2ec1bab65eb Mon Sep 17 00:00:00 2001 From: sdpython Date: Wed, 22 Oct 2014 02:29:38 +0200 Subject: [PATCH] =?UTF-8?q?s=C3=A9ance=206=20:=20initiation=20HDFS?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- _doc/notebooks/td3a/hdfspath.png | Bin 0 -> 5082 bytes _doc/notebooks/td3a/servercred.png | Bin 0 -> 5101 bytes .../td3a/td3a_cenonce_session6.ipynb | 539 +++++++++++++++++- 3 files changed, 526 insertions(+), 13 deletions(-) create mode 100644 _doc/notebooks/td3a/hdfspath.png create mode 100644 _doc/notebooks/td3a/servercred.png diff --git a/_doc/notebooks/td3a/hdfspath.png b/_doc/notebooks/td3a/hdfspath.png new file mode 100644 index 0000000000000000000000000000000000000000..3ade615f821c9344cf37b7b0a7c357e0f62be834 GIT binary patch literal 5082 zcmai2WmHsc*FGR3EipXO@kmSe(9-45B^^pAAPhZ}N+?~D1JVcl4mAY}1polldb*mX0C0nW@Lfzs zOju7-D(@3sL{L*5b>Pz|+a>{!xZi(3FxI6~UO3+*VDe|WHc$Yd>G|^z_4~ea0{{kJ zJ(Ar3f9Nef%^n=U`a)LNxsnmJ#axftQ2;13HQa|g*Ku<6Qi|^d|f!z<-P{7QM~%?2YX< z*1WniOIJ&wTU^58rO%P&C#ehMV+t?BJ{@a&=9o(J*FocS>|bq+gPGAvrO$0}qRY5h z=0OU%pVFLXlX@YZchcB%v&x) z_bp?IxLI=JjCMqAFeh8aS~Uy_x)n1OB3x0TtyNxZ3HmuhR;?F{fsnom%{-2YuB>pE zV5HX|#SCtwb*Ly8YNFg6p|Ini(GD7;Z?nAxSf|fBenF2NPADOC<)>;B^R-cZ45`uoc$}VcI-o?MXPukW=fZXg+@5Lmqjz|~wkBU{bBmBMqkXVb!SfjC* z9^ZD;=qwMe+2rs2DDJsVgBRpX_`VvQ7cfSsKszUyug#;wpl4h1vu@;T45e=KLi*;O zkNxKYYQS2vAh|Dqlsov4LbjJKFk}R|$BYP>jCrGd=|qEhU$AyF`zc?{P~hu>Zvn_X z%qEsfO~@x_Dut@h%-z|Nj@qyo7TD0%mG3_$+cNxQww|{yiqKrwgGv}#b6GCSXYkRT zE8_cV;w|{;kpp6biza4XW?q%ghM+7RY&kK}H|Z9BscDj8p{0+gidiwjA?3RDzeklZ z#Fhw%<_@@mXWmCS(J>5j|IRhR$rXS3PEQTIs}b)(PSPFi!l_LA$z0Xd#6P|B6GU4< zyc2m@0NR?uG%L)2sbpKL%I=KTiLI^B4WQB)!8FEhsYcNZ z#&|dxt0cc)fA&Mjn>6~lChP&VTm~+vG!A7_^iw7-Y}3aw!Dwt0$s}-t1BqP|)k2XO+8 zCb{&ytjj@F8=?}i9c3Q_`p>T|h%M#DVqSOv0R0@_jziu*999pmXW>a1c$hv3$?C!U z;=RDu+rWZ=oykyxG~8T$`fPo;)IfQr<6X-ra$LN;gN%%nvC`*r8?N3|LyFpyS#WHU zC6i(-RL=#lf8ofnB+VG5fgb*7qZY3wEw13t6qc4;W%!;BA+Pa7oR0A`I5e`GCoE+5 zHK}$OOy6 
z3WQO4CoAODCmJ>5ip2-m@;eh<*4;;Su78{c=i6-gl614z4TT)rH-}QB%%)qb8=;cs z)E_9*CyDb-GU`Ayh%N6WE0i#co^!TC{vAecEnlpgaAZM`N80}QnETV^01HW~cmCBq z9@R<)MpfF}uTaxqn*b^f7(iOeV_(f;=f4nYu9cml>xy-&J+|p3x={k{_R?<=J%AGC zTutmFy}WPdCvw-Wou`ZE%Zp;t1Hy|JBkS6uW|D(vQ%B}&;%OS+9lH$&9NnsP1ZtZw z@lF%6$uHO(p3irx<2Py>JJ;Q6iBss4JMR7Sth2eY=S@+{s3SU}w6Z}u3e%R3%X3u| zvsXpc=ljWncJOf=3 z$}mO##K7^pAo3eB-sw0M6Zty|{O30xnHu5z%R;@XP46-M^$$Ymp8a9gRK2(%VJka8 z^}Kc!orM_-+!u!DK42EpfEh{Vli-)E8*tJ2k+Wyp!Z~ISRv63reRV=SL)MzwSWD>_ z6#4J^sb;IQk;p8@d+icUXl}qO1v6J3R%D>f-t_(GhAWw7ya!`DSGLmV*Or4Hy{qG+ zngKKazGY{WOw~2M0ss4jjHKP}WQJ3v_o`9=7UXT6r|*BX#CSLMd&Z~MlzSQ=wrV0O z!c^XDt*7}BO;(7?HqR5-f%Cy69?NyhtTTDPrFg^R6&B>3AYmWSZW$*a`NE2`3@RBG z>%jU0Xv*So|7w?=R1U*R+q`X@-kGlSxP^0(W7N{GgkcIT;pTb{^q7G8?MX0ooV=Qd zGIIV=MW4lpgG93I1gT^Lq};7#7PATDPp?iS|J5l_h;n*$W6Jam+?+zG3CSf{+tKjL zN-l*o?N{!{J)79xyJeTxw()mg*xUrVEbN&5yz!uXDinB=%JGh~ zKz5BX)u0c(%vVY-7lmry>z$6s{gbEz1I=RBg}HJW^oe>AM$6qtVCWx)AAopH>h*0eA^Rw{nfowo#34nKXkjr3D)EL6Yf}U)}n2Dq*aJqYmI7=R*B45LgsI!5-#5JY<9T6m92IhcYsI!g;~@z*y@fLx?=c-=6c+!%LX8TB z=(SF_Rj;p3`y*XeQB|p`;lGr^LoO11BTr9Hkz>osovW_(G9wRND0V9M&#w^?X%DVH zXI_sOT}_N}ULPD!ht{bjCGU%#kR|!P83WH>A^qdUGy2bIQkTCtwqLHjzum^cF2_3R zxO3urb&A$fxwspya|2pXIedbelYuN>>~Z>XTz7<>=X`M3sI{J1d6=7zek0z*sLHMK z=vsO?iLGI<)v*{?C94vpy}BXaGKqj1wr5^0?n*9lCfuZzvBrWvzD%W2MJoG@xn73& zFKc}4U%VX$OI;aqmrhA>^&SEpy$LQB{7Y@rsSq_Sv)^nTzLFA&n^#G0nc{fh#j)GD z+%|X^-v47mUc7zw+Uxx7_3!Nl;>gQ=qfMh%X2KgI_y#bVA%L|Sn&1*v<{5%8VmLOM z*;rLQT){kn!O&5hhZxScZ}IX1OepS8Ycycx2=S1XKaV@7Be$m3@0;1$AbuNf( zUQ`nRY3n~5w(^Z0eP%&HlZJFJjyp7yVFI)v77mz{kD`w{!mw_bR_DPw?_y?)5&h{V z;&E+@BX&Aww^0a&kIXdyGLqd5XO&9e#tW|(d*M=Qa@D&&9LNcO6OwQ7_)Wjkpii2q zy4KhoO&BPfHn%G9HmtvQg)3JKH1uZlL{QPM4ww8CwX_iB7w>b1)!TR|u@3U2xWdek zukMT$bamOkH1!opJe@Uq?e<4WgLvpqj+VurCexUBk7Q7sSG3hx0wy+d92%!zKSy(hM@y;`Ave!^slL7_#%&387~se2M`29>TgSR+jhW&K!tdn|}_c2c$*kKjIdHH}U|$nATt3soQK z!lzttcbAo?OdFf@(`yPdj%Q^V-sNB ztW*)Ij_y&t(A!znK~6ny%rvqSats?~vf#1`7M1XOONvnHSm)VkF!|ECAg+?#B!4(L 
zIc-#s`$Il!Poj?w&iKUZB`7GSo&u$2mz*(=;mSwm=`Pd7QS^NJ0yUMX#Usdd!ug<{ zLsN*c={ZsojIA{yxLj$cou%w9O3tt(`+O+qyUO4485fBO)}gmer@&2y+jToL8q2h? ztX-ywzRfUL)dANz$ilW+PCKL(>b1;a@)GRjE73}MBsiarX6+GeYWnazdsMJx87<}e zg}$Xfe$>;ztCLNlNW~ykvf;eaU!hy~9M7+A^SU)B#-a{It*?az@k%2>rc7`TGn>%Y zchHXUG^?JDrkiVG&09h>R z8sOAEJ;XH(8{zI>e=jZ-2z)!_Rmjk~%Bx~O{g*wGtn4ShT=e%QAF)B!7kg9_s3WoE zJbLI(%5ryIMdu-f4pXJ_JdH434!_LT9rurH!(J$U-2Hgq_*5|cxk&d@*2O~CCadyV zmKI*Xp>-BsiZ7oJRF;ayY4w=c|K3E*)90C-2&}D&P=%=YPCx!|Md!kPj39gvWHx5hed_XeBrb8*oTj!=}o@Q$AP_t{}%Wi;RbBXEv~)oN{+=)1()_+U2N_qn?V)-!g4icvB!9tf*ey z;nnZ`d4eonpPhl&aK7zf=d~UU)|IerDei|cMz5>J@$W>N6>ickmOQq!S?~GO!X8r~ zk;8I{|N2P7HO308a8{9S-jbO2+kri;R4G$X+L}V@+im1mxtcvH%bfW~eU?ZO0k*i| zDBCYEtXJ>%q8a~qS#;`KbcyS+1UMD+g_;Ebz9^rx(=5(YYwSi77ovO?e|Bt*RHm8*zR8(sOLIQA;HvFtHBx$DVAkMkjxD|I(W}wf^ZFPf2k^dM*7ejYXS5AQXxCRxNH2HFxixM`av2|0qcv&~cqWw~P@Td9L3abk~uOM5$cN%ysIzTAs0mt?JRFp;&6C3eb0 z4`u1A@PWOl4L;>%|%E-{5uHp!4=5=kC%%U xaWUJ!|N5_l2y)Z^r&Ikm`+tS4qB~+&oPJFb`v@4qhMHAUxpoU zKL7xg8QKefT?W-xM7#RO0>G*!^`p^(d58jlwHMu-4tV1bq9HbQdpM%;?MD5`2-I2T zz1`85(ihg=dfE0DkIi~^mK!_w&TKV1zxF5nFE{S(?J(L(xz6b=A- z=Ui8kSmT&=(~88SuLWWhGE^PKG>SDxBb>lZjX zKOpH?^)cCzAboB+fT)R&;85+|C1<3H{u_kp<^sjEpw01(t%C{J?~*UPq&MPEvp{%a zSZW^z3tKR98_Bnza?TS&R16f(D-{+fSoC%jEU?&gzDA-ep8-Y41z}Ih%x4R@aysLr zv;nNjQxSK8Z%UnSejS|iVCT-um+G=-I7p(t>LJa9GG$@E59^11?ySXoSf9vx`IJI7 z+*|_xV9l>Se>U+5Yc{%Q6AZJa-tJlEG`e&n2jf^*L`;9X@=WSXLB}zhhTc1Acn|dZ zfl|L9{G4<}#f()$5LcGv2zsd$Bd`>SuIgRo0+-dw8cJ4Bk7<@WwP!mzvgU0uuUhI4 zw-H_$C>J`EBcj(O=C)SY)Y>0XvAGh?3s9=S@z9Vi> zCE;$iHKh-{!t|>yYT_TLPLs3yOi68a*^3l@9$?OnxumB+a6#mUA(P1nELYkp@Ff$*-yALIU{xS!j@9i7= zr{#zihuP?T$HF27NyjPct*1cX`s;thq^O%Uac-fC09S>#I|`(~&K=QT)4V3_V2=Lx z4_DpV64JcZrH67V_sLgY6SR4UBJA)SP$feuI>RxBF>P@bjOfR^f5|Vf*ZD|;AdGEj z@B8E%id3pZvuN1yV%PS&&y8|IkMG!Z(g02r->iKoE=G4)7Yrv3Xm+M&^ra_{RnxzU zxi@%Jw!zcg)7x@}Uk$SgHf7(L^0`w)w*`r^DfR+|El-vE6lwSsOLFTmMn(GBao?R( zz5n@}SiNV?SF^6Z&-0Wpr!+!aPTqYV&L|Sr&038#Hw1^F>;&s^;Y*^RR#_-=*b^n+ z>(s6hBy+Ltugy%-nv?aFx1L%>*p`GIPMUSjH{Qa 
z=}BYN13jK=n=LE-JYEev5Ea5^%oP_kErxuF4g*m5V}hiFIhkrt3Mssi)EeJAv37e~ zP&Q{v6~$ztJ_&JNIF#jW-6KQP&h~2lZSt2f74lZZ;M*_{d{dV^Aig+QEOf$>C!f{Y zAinGV0#RGGkW}uMm^<7p1nH{J><`S~Y)IzJGgB$8c6hbonP;@QT0rCbOaip&9FS@N zfyQc>I)!xHA)bYqzNz+XWurZ(yX?E=03{n ztAOmn)m&@|G#V8W?j#M`2^sVE-;hZlACYS8_q}x(C2+Bnp%*MxJT1|Jfc3~t=A+Ydmp_$am`FMjyC-h)|a4^CKw&by?so*k5# z?bxBQt~aN=$ty2}pto-B8te6>2wKFEdRwrWAySwc0MVjF7o6%wgy9lF#WxASi_2;v zJ+_0uU7i0yU+IE})&;5`L?9Fxq<;iPP?0sTUuMkjP&4B_Z*2=4S2J7)nYuyxKLg

{ZviciiK#;Z?jV z+Z6XEM~JbKIhU*@&qO{pKDh8Yk{|EI_4%vyGeiL{wH$tHX>`W7EddcV)ug7V-EFn| zV0MXSAie^A^Y`Xri@Kj{$|CYY1M8!z>I}RjFXmbxiTt)Iq6v@2de^ImYlw?99j))1 z(@WMWhhJG5d=$cOv#C0my`WRObunzcA(m*1Dk1&e&4*a5*~n3M)CJxgX@w4W>SoZq zZzFIQ%n~ocU%If-09fSq&n^9#Lywu-y1IrG`V)m%DK+BRCxq4LJ}|Y;+qaIve+0Ta z1_`J;=pW4SX0o*4rd_Efv-GYd^-@r-4H0%qn~Czt+!pr~=lCB!R9D;N_;-oS^U%I0 zFPrv-vPsaq7nf+~_Dw)jf~<_U9x+OerhFZKw5p~ixT@7*bB_3CZ+-IJ$3h=9d1Pzg3x&^$O z_&%iv(J)YfB$p-RPv;a&kG2Gm1y>t8jG(qkF)?Lsn3GM+T=GG4=@qM{@M5`}c< zd)bAIeh0Mu>}tCF6wFC_;1CLCAB0jJeaD zoQmFee2fIvF{)FnWd|p!(&|&V3MRMZ8l%S%gY~x^H1Pji=GE%>!_UG?V_oaO>e3$a zxCdo5Cl^^WwwAgaHxw7Z$vm5nzfnCp@_;vFIc3JxlOY#2S7u7V z7Mna0cR+Q~TrtHbP-k9;d&N^l8Tu_V_dfrmqX0jR%U5t-{V2S28S*tOJb($ZYv@U(ejiI%!1?`?6c6_{#ioGo4DGo%9Q{q>d zR$-F=rsN*(pkMK{%6TO+BX%m=Kb=$Drl+>e1$Y{6%Obq^imTOSmroeYY&bX1O1s2; z3U4s-z}jgOqK7s<_szLYHv)OOmh#)&fp|eT){$SrrGM0(ekC*|k3CHnb&c-BPA49w z>2BBxg-lhHt2T&=F~jxPk0+SL@ci*^Z~t(Fwe_#}vyeDazzG zFLqStdvXUxKWx1;#}vPu4Zn3uE8In~H%)}EOB+y{rWwNv7|MjW&_zlwO>1&`Se?Onp~ z{U0`Q>8eX{<6{LAD-s-MKU!=Vp)x~~u8AvJ<5G=b4mLQdK}z8-a{GxhgkPge<@CK= zb12fF*qgmGXsh?uo+}G4OC0-YxYS{^h5Q?AS%d>pD0SZqA^1-%kFBMAhB6}JcTrPiAzHxQWel2y zyfo5hvD3OLi;G&4_E;msc-(bDd#jX;($r5$kBjrtqg2cb1 zE~m3vylmXtE57xP)YaqgCWwkThB%8;MXN2{<{sO`D%4hNXGpR*8ss=I1fP^JPml>xlyv}bCsK(7QB1)s&pA2uL{c00zQ7pP#$TyPc9(DL@sEYL;FeH z<-3ezdZK#I8VaEgjDuJ0vSSQm1&R((ZSZ#; zGQG~ZK-tTS2M`f_0ULYqBM5(XGw{X_V| z+F4i`-I_ZVbH0>^GYg}SbwCjJsUh|Y5d?A)>OL4+^L;4Lrs^xpOj;xx2_5U+Ny65G>DZ^8=3_=k`iFMhk8i; zF4wi>$*4aKBaa>C>Lyy}oj*Tidf@lY(_CzBqFK6Hso`4-`_t7VyFp$Lt4Y429@^5) zBB-0&{yzr(1%2F+_$4XiPZkxD9musk=tNeRy}DUzd{WgJkFWaU5|n!aK>0A1>{zLJ~Uh|yP526vs2yrWU5Enm1nVNN;Ge3 zFgvQI&IeD!ctC2bhHidKx5};H*1~jIC)e-J2H*l|>uA5N`X&*0ZB1gscaWvpGmD

Partie 1 : manipulation de fichiers" + "

Part 1: file manipulation

\n", + "\n", + "Avant de commencer \u00e0 d\u00e9placer des fichiers, il faut comprendre qu'il y a trois emplacements :\n", + "\n", + "* l'ordinateur local (celui dont vous vous servez)\n", + "* la machine distante ou [passerelle](http://fr.wikipedia.org/wiki/Passerelle_%28informatique%29)\n", + "* le cluster\n", + "\n", + "Les fichiers vont transiter sans cesse par cette passerelle. La passerelle est connect\u00e9e \u00e0 l'ordinateur local via [SSH](http://fr.wikipedia.org/wiki/Secure_Shell)." ] }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "from IPython.core.display import Image\n", + "Image(\"hdfspath.png\")" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "metadata": {}, + "output_type": "pyout", + "png": "iVBORw0KGgoAAAANSUhEUgAAAxMAAACgCAIAAAD4jqZBAAAAAXNSR0IArs4c6QAAAARnQU1BAACx\njwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAABNvSURBVHhe7d2huhy5lcBxO4+wPMwO2G9Z2IQs\n9aC8QdgMtMmyfYgZOGGhi5ZkTEPGbFm+gHjY8n0Fr7oly7KqSqUqHUlH0v9HfG/d7irp1DnSud09\nd15++vTpBQAAADL8xv0LAACAM3ROAAAAueicAAAActE5AQAA5KJzAgAAyEXnBAAAkIvOCQAAIBed\nEwAAQC46JwAAgFx0TgAAALnonAAAAHLROQEAAOSicwIAAMj18tOnT+7Lyl6+fOm+6qrZfLd8BDqO\nAQAAlFiucxJxL2h0TgAAjI536+4wPZDlvr/o9hMBAEBfHV5z6viKi6qWhVeeAAAYzlqdUznB3ovO\nCQCA4fBu3TWm3bHc9wAAYCW85tRa+KoVHRhwz+mrvxQXgEronFqjcwKOnPZD+SguAJXQObVG54QF\nCbZEmSguAJXQOXVAKLCIqg3TbvlQXNAgJ/NJ0XHROXVAKLCC221TSV1QXOjlasKTouOic+ogLDCi\ngcmk94/aCc86g3qu9kZpl1I0J7FzhkddiOjQORncPKKBWe0u3+3XGcoKN+Q0H/nSSXhpF5AdmJSj\nYXcfbe3y5+85AajCLF6eOwToYLb2Xe7HZVzSk/bz6vOak0FWhQEhGpiGT+wuWd336lBFqhOKSKXW\npS0gfy67p2ocikqXy1e7/Nt1TsalRJke0cCUfGJ3yWrKCtbtzbvLnphzUT25fTrsXkO9FM8SvFsH\nAFiR2V+33M/00Ty2SDjUsIuaRtPOafpoXjJQGQCjoKwQMSlxxD1iQGNtoPNt97zmBACYls5t24xq\nvn5iHa07p7DNJ28AAIuw3ZLlDl000OtkQ7+kd6rpJ8StMGnmDu4pH4rF44CZdM9qygpGl40mvGi+\nS8PrMq9IfondC0i52pHp8G5dr5sNAFhEm43GdAYhd/QKNsQRdXjNyfAZtnjShJVG/WAO3aub5QWW\nbCaEy3WJ8sF0yfDd6ecMQCpul9SODJ1TZ2FWEQ1MoHt1s7zAmnLPNrrMa9eyJcZ/WwcAwE2mewi5\no5garzn1F/4CQUAwuu7VzfICS/y1GSUZJT6vTBSUR+ekAgHBNLonM9UEoCrerdOl1y8T
AAAgB50T\nAABALjonFXhbAQCAIXT+nJNB02ARE8wh+phR7Tegt8USDQCoqjDDydIR9e+cDFLHCGNCQDCuqHEp\n3FdObYslGgAQqp2QV93IUjK8uz7v1kX3W1sqAwCmYbYYzx1SQ+GQBNmYG+77WfR5zcmKokn7HAaE\naGBQPo2r5nDiKm0GgIFEe00NN5LtdqKG01Ge5DWGeuluVopPz87JGCgDGogSgoBgRD6NeyVw9wFA\nm929tnt63E7UcDpHz92dcj1KhrFV6S537pyMnCRYxDbJFg8IRuTTuFf2dh8AtFGYEiUb33an6C4x\nhb6jrXTH+/9VAlY3z4SCaAAAEtgm7F6Z4B5XTf/XnIyS7ntKBATj8tnbK3W7DwDaKFxRy7M0nFRC\n1fnmzKJLPda+qLrOydAwpL4ICAYVpm6vvO2yUkMzDWkZKc/SGpMKz3lJYgBdgl8e3jQVf0PczK1Z\nQIdANDA6chh6kI2nTKsRtjhDazARFa85eX7CJHp474kGRqGhhFlGsKUtK8rHE+4Rxul5osfLSly9\n6nVPVbrddE56EQ0MR0PSUjjY0pYVIuPxJ7GOThU97JR4iK4OQFCl203npBfRwFjC9bFj0lI42MrZ\nvFsmjEiW5kwqX73py44zU9W7qbRzMlQNrAuR0gKaUZKxFA626m3eV9MsGklhll6aFxUhRW/nZCx+\nm8NokPHQz2ds33RVMgyocqnDaKk8S0+nRiGI09U5WXQMHnsABqIkXakaXHLaeVRFlo5IY+dksPZZ\nxAEDoXPCakq6LvJzXHROqoVlSZnB+/c3/2O/+Nv739svNMgs2/zN5kbOUzLK+dT1VOUwkEPFX8JM\nyF9kp0cosLXdh2DRNo2CHF6B2b9C7uiwtL/mZLACjhKNvivgcL+5psOVno7OX9x9oqazNH/dvJHt\nmWNALyVpj3HtVv24Raq0czKiQC++Dg6xH+j83VHnWlwvVh3nqyFL6ZxGcVQCNE/a7DY9p8ICTJwh\nXaf3Lp0gtSzQOY0hjIbaUOjsnHY1WJ17RYPOyX6x+IrRkmyq0zlpUN6yhAUYVmX5mUtILQt6OyeL\nddAjFAkDNW2R7T5ROBc6J/sFZVJP7XKjeWqpUisTFuC2Kitd9JTUsjBM52QsvhSyJVylv52S2iH8\nTPtuORpSdNYyGfd3gxvonC7Zzflw6yyXU03hFaPHb0d4aXjltbwbohLaOycjcT+WQhwENduHau8B\n0UTonKYpk+FapZLcU5XG4sKcPFKSqznnv+HqkBLl33dlqLEmDNY5GfoHXE+NDMDQ6JwigjUyXO/S\nkmCmbeM8dOcUbViZSnL13hUjhcWSrruOK0MUHKkBDNA5Gem7sg7igAid05ZUmSjsnCZ7MWY3wjXm\nGG2fCpUkanp2bYoxXXS9VoYoMoJXH6NzMnqFXhviALWUJGe4XJaMpHvnNFmftNXs1aZoB+1iNxVF\nSqZ73Z1WnFRJXhLddNnr0jkNhjhALT3J2WWlxlXNXjGNNtEGMrNOpGS6193pAJrV49GNFr8ondNg\niAPUUpWcVAo8tckgMrC+s/NXN44GkPOYrfBZhcQjo/3/W4cjglkFAMBVmdtQ2LiYp2RyTyhWo6Hk\nNafxEAropCozKRN4apNBZGD+JH2lp9BykA3uMp3TeAgFdFKVmZQJPLXJIDIwf5KOTsdfMkiFJTze\nu3UaskQJQrHn1x//YALz8g8//uoOAOO5lsY7j37//ePIw/fv3SFoWTPtjTHc9wsw3c9t7hSaDNM5\n6QxfF2Eolqo9DMpuEjW4CyBiOqlv/+y+fvH3f/JLRKBj2tikNdz3Qp7dRWduKMvgE+IAMJeP//jw\n+Oe7nx+b2i9vXz0P4jPx3uVUjYYJHdE5DWnBHh8A7nk0kE3WTNshbbkfb9iBGe57DGLIzimRiAta\nIBpfPrJx9FGO3Q+EfPkxH3qqz4TZfbXhNgc57rxr
OErj8PhHd+zhcdy9Vffnb58/fT7rpFhWEGaO\ni4Q0d/Ykm8OWO4TRjNQ5kWehMBqZFTsms+J/+cjGixcf3v3J7wS7x50P715/+bH5ho/JttKgTh97\nzpP7fl5HafxV9n949+2757tzh06KBW0skrQrGPXdOtvg53NPm8sSRfjrP//++Md+YuPTp48/fPM8\nfHj8i29++Pj4yc/fPb/jY7I1zVpiCuym8fv/tn2QTf+vk//NT/7Rzx//8vbVebFAjI3xLvcIjG+V\nzznZ/slyhzCSx/sOj9+3X7395asPvB4dNxvOX+z3b/743EU+/CN8RwNzm6bM99PYNU7f/PAfbx7/\nvnr7l6xe6LBY1uFamJrclTC1FT8hPmX/NN+MHLPKu1+g7ac2Pn9G4+g4utK2c0y9k/3b7zLbH4oF\nEDVY5xStg48OP8k9bs9j/Xhy30Otx9sPhnuP4ctnNI6OY23LFPWVd6ApFkDOYJ3T1TXxuVh84Y5+\n7dk+jbrUHk1qHu+/37zK9HzH4ug4ML3X/2r7n//667MCfv3xT+lPiFMs6M3uszncE3QbqXOKYnqj\naXi2TxP2T9bo4z/04d1re3vcfx/03R+fn+44PA483VgixvDq7X/a9sdWwOuT/7LugWLBIFyeHnOP\n62rFzznZ/slw3wf03Jh84USGG/y5Nz99/V8Cfffzp58ea/7RcWABX6X/Nz98/LoYYhQLJqJhm365\n20AoFEZKcMxXb4DOcFUKDpDJZ6CG9FM1GABGTlVe3Y6tLmW+eudk3bthyrFtoBlVzUrVtQLQRlX1\nHREZZM5O3SYIY3ROzZbCnBsziiHuLOagau2OqphCwKzyNyzBKji6aPoS4bNKBpM/5bTCgAzQOelZ\nB6XuWRv67yym4UtDSdZJLdOAiPTecTtFRbak/KuXzEK8JAvnXjgG7Z2TeLhnFaURsUJLPv30JJ7C\nIUGPwn1X1u0UHWgW4VBrlOTVUBSOgc5pBtukIVZoyWegqsTTOSp0cXVnbakkP6N5RaeqOmt7rfwq\nCwfTqyTzR5umt3NKJwS8bW0QKzQmtR7J0jkqRP7vt6/dV4F/+V/hv9S5XScbi5JQMDmjqWWesCQg\n4SXyJxJesXzW90iFXWnntL2pOsfZ0VHeEyi0J7UeydKwUo9it33pqFnn1CsxZEumY6rnT0RDPUqF\nXV3nFAbX0jbC7rYh8ojV3I62N/Ft5iqp9Uic1GLtI9891FK0tUqRqp2ThiwVL5mWEwyv5eVcdPeJ\nXRSGSHvnpG143SUyj1hNLL3P0TkllI8tDL7Czkl5D5SpdmDDlVNDloqXTLQ1lJw2OlWmnCveO3MN\nhWHX1TkJ3vsp7aYdUZrMjY2QzikhrJobw4tuR+1Q92qDFHaEsgrTQFyNktndIJrJmUjfEYYKw663\nc1I1sO62CUd8JnNpy9S2z9XYBgTdW1i2d6Q87L0aoyPTN0yhlvvLdsU+IjiS/IuKWHkPUtQ5tUzr\nsezWQ98QhRvAUotvuXt7p/Ig+xRVW7k3Rni7c2rfHlGDOWpvMbsL9SnZkdwbwy61tayBls4put/c\nM2+3ErrHh85pS2q/HDGePkuPMrP7pnI6wkj77idCWYm7l4RVsdMNSkXnFCU0yRRSGJxoU1l5iRfc\nX4cO42lfUmnTyi+H0xFGqnZOdEW9VMrDBLazKfXvnBR2BnqEwdEQme12MuUewK551WlfUmnHyi+K\n0xFGRHKADkmbSnnosX8tomfntJvEUuMxJx89iaP49J1O1WZiJmtullf7klOZO1z+5a6O8CjhaYYA\ndGsvtitj5kjCJx49RXwdb+92fCqhc0pbfEPVX3H3Rrib9jRPwOI6dE7bnsDIH0b49O2zzE/tQf/F\noKIodZ/Lap0Tu+MlPl3VFl3JCKPkJzeAxbVuL8obgvAM0dP1L9+ZEnPsjt/CsaW/9ApHGKY92Q4s\n7jfu3ybChuA2
s/B57tBn9sj2+CmRgS3CbBvbnWO1V6SwGrolAF6715yi7qTZdT0zgO1Fw1Fd+um2\n2RKZUfco5eMtDHg+b9VmrP4RYh0DrfPY1eg1p+6JYgcQDSP8tnvumsFEw1OOVgkAMtkV3nKHPnNH\nh1r/C7kJn3GP1qd657Sdf5cexV40urQ/GB1XQueoQjRPAHDEbn+WO7S8S9GwD/bcUQUqdk67U+3Y\nDexeOj0e81PPHfrMHQ24H9yiKlDAZHx9UVao4bnX7XA/PvDcN0jIC1xYP3NHe6j1OafdWWVeyzx3\nqXyKYkUtYTg+h3Vmr/LhoYHdLcnL35vcV7ccXaU8P1tm+GkQ0mPIH+qNaDeYvlWlR9mdcOaFWmaA\nElG41pk4pqG8bBdcVRDZ3ZUayEm58vzsNbtd6Vncm2y9Cd6LuXzntJ3h1UuYMxw9JTx5zmMs8TkK\nikareajAEZ/GOhNY+fDQwHZfkFWSWuX5WXt2l6RnIVKMgvO9NwzJzmk7GcGTW+Eljk6+G9PESMzj\nxceZKRpqr2EAhXwm68xh5cNDG+F66zMhWoRPiafQ7qguuToFEduhZlZZ5sOuuh2EmzGXGv3uuGVD\nY4RXOTr5pZFUuouZ+l4dkKI8kyk0WAozwQ/JuDeq8AxWl9llxlbJLSgchsx/W7d752rExZ7Wcoc2\n3I+f3KFj9jE5jxS3DRoAALc9970O29kNQ++AAp3TbtvkvurNppHhvt+T/mklYdC6DAAAoEf5RsBW\n0kxR52S2f81tUxvbCJy68RQA9/hyY18BVHk0EM13Q5GL3umc7IUN931gzbZpNxRHogezmgPAmi7t\nHadkz5bPXDf/0tstr9mwL40z7donxBNXLekAwtOmz5P/SCsa8OnJb8zi6rOuTgHQz2e1wpTWPDY0\npi0ZRLaD8CQanE4kPWDxW5O43P2YZz6z6lTzsyf/kVY07MRT/CML55J2dfzAENqUzz2ax4bGtK3A\nIuMJT6JB5kT6Drvw7me9W3c0Q3Nty30/MjuLqnPRVrQAgGbMFhBxPyjbEfTsJmYk+YO59GBZ5dc9\nec0pvLWhXhMeVxRJAoiZ+PRWmNiax4b2GuTD0b6ZQHLeCNolshFOdU67M+EG3xBFkhhiMg12o3vU\nDgy91N6h7yE/x3Ktc+Lu3kDbhOmlG5QGe9VRWaUHhgU1yMZTZOPoDjsn9nsRhBErSDcoDfaq03WM\n0oPVIBsN8m1u+50T+70IwohFpBuUBnvVUXGlBwYAN2R1Tsag606NJTsnFLvXZe3GrNQ2KHROAMTl\ndk7dXVr4Gow/MZ7t1Vm1MTc6JwDrGKZzGhdLNqZH5wRgHYefEA+N3kiJLJo3gsBijUXQOQFYR1bn\n1FFJ01ZjapfGw2KNRdA5AViH9s4JgH7hbxSqlhQ6JwDisv6/dQCQQF8CYB10TgAAALnonAAAAHLR\nOQEAAOSicwIAAMhF5wQAAJCLzgnAnC799TUAyMTfcwIgoNdfTspsj1joAEihcwIgQKpzqvFCEasc\nAEF0TgAEaH5rjFUOgCA6JwAC+nZOrGMAmqFzAiBAqnNiRQKgHJ0TAABALv4qAQAAQC46JwAAgFx0\nTgAAALnonAAAAHLROQEAAOSicwIAAMhF5wQAAJCLzgkAACAXnRMAAECeFy/+H3G1MfG2e7ZZAAAA\nAElFTkSuQmCC\n", + "prompt_number": 2, + "text": [ + "" + ] + } + ], + "prompt_number": 2 + }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Le cluster [Hadoop](http://hadoop.apache.org/) inclut un syst\u00e8me de fichiers 
[HDFS](http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html). Celui fonctionne \u00e0 peu pr\u00e8s comme un syst\u00e8me de fichiers linux avec les m\u00eames commandes \u00e0 ceci pr\u00e8s qu'un fichier n'est plus n\u00e9cessaire loclis\u00e9 sur une seule machine mais peut-\u00eatre r\u00e9parti sur plusieurs. Pour \u00e9viter la perte de donn\u00e9es due \u00e0 des machines d\u00e9faillantes, les donn\u00e9es sont [r\u00e9pliqu\u00e9es trois fois](http://www.bigdataplanet.info/2013/10/Hadoop-Tutorial-Part-3-Replication-and-Read-Operations-in-HDFS.html).\n", + "Files are **uploaded** along the blue path and **downloaded** along the red path. The gateway runs Linux, and the [Hadoop](http://hadoop.apache.org/) cluster includes an [HDFS](http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html) file system. HDFS works much like a Linux file system, with the same commands, except that a file is no longer necessarily stored on a single machine and may be spread across several. To avoid losing data when machines fail, the data is [replicated three times](http://www.bigdataplanet.info/2013/10/Hadoop-Tutorial-Part-3-Replication-and-Read-Operations-in-HDFS.html). The standard operations are available (copy, rename, delete), plus two more: upload and download. The commands are almost identical to their Linux counterparts but are prefixed with ``hdfs``.\n", "\n", - "Pour manipuler les donn\u00e9es sur un cluster, il faut d'abord les [uploader](http://fr.wiktionary.org/wiki/uploader) sur ce cluster. Pour les r\u00e9cup\u00e9rer, il faut les downloader." + "To work with data on a cluster, you must first [upload](http://fr.wiktionary.org/wiki/uploader) it to that cluster. 
To retrieve it, you must [download](http://fr.wikipedia.org/wiki/T%C3%A9l%C3%A9chargement) it. To make things easier, we will use [magic commands](http://ipython.org/ipython-doc/dev/interactive/magics.html?highlight=command) implemented in the [pyensae](http://www.xavierdupre.fr/app/pyensae/helpsphinx/) module (>= 0.8). The first task is to store the server name, your username, and your password in the notebook's workspace." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "import pyquickhelper\n", + "params={\"server\":\"\", \"username\":\"\", \"password\":\"\"}\n", + "params = pyquickhelper.open_window_params(params=params,title=\"server + credentials\",help_string=\"renseigner\")\n", + "password = params[\"password\"]\n", + "server = params[\"server\"]\n", + "username = params[\"username\"]" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 9 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "from IPython.core.display import Image\n", + "Image(\"servercred.png\")" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "metadata": {}, + "output_type": "pyout", + "png": 
"iVBORw0KGgoAAAANSUhEUgAAA1IAAADyCAIAAADIsGMjAAAAAXNSR0IArs4c6QAAAARnQU1BAACx\njwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAABOCSURBVHhe7dy/qyVpXsDhGxlsamRivKEwcIOO\n/BOW0ckMTiAXOzMbzDbyIixyE4UZBBs20ssmIp2My7qJkwwdCLsgLQgOBhqZGG3Qft9636p661Sd\nH3Xu6b73Ped5OLxTv946dXp6eD97bvfe/DMAAFfg5jcAAJzVtz/60Ut4lafpyT4AgDPbyq/nepWn\n6ck+AIAz28qv53qVp+nJPgCAM9vKr+d6lafpyT4AgDPbyq/nepWn6ck+AIAz28qv53qVp+ktZN+H\nX91svX7xs9//yd/+54cPHz774rsf/PynN7/6qn6VaQAAF+eHe5WLZrby67le5Wl6+7NvI/sAgGtW\n+m6HctHMVn4d/4rcCvuPHP8qT9Pbn313sg8AuGZ72u5jZF+86s6rt094lafprfgh77fffvvZH/3T\nD345aT7ZB1yJ919//sMvvyk7T/PNl7FYHLhXXPP51+/LDvB8Pn32xSvXXrZ1atWrPE3vYPb91pB9\n8caf/fHfyT7gOj0p+9ZHnOyDF+JZsi9eT2++eJWn6R38IW/6Oe8k+/71T29+/del+f79L2QfcCVk\nH1ynK/m27w/68vvtX/zsz3/y+C8//vGPU/a9+7Ob/34t++CqRZIUY5mkKiqGg7mU8tVffhn/qLIp\nnesvPDh3V23FRQfbqLp5vtHCnZceIKk/6DfdtHJieUq+83CmHB9vEro7xJHxbaob1W8+uaZ+juEK\nYIfyn8tMOb3Snol7Tm3l16pXXXv19gmv8jS9Xdn3kz77Uv/94h9/7w//5LH82b7Ivtx8/atMA65E\nypQ+Pr75Om/Ux7pGyX3UHf78y/7i7dipLjo0d4e4aH8HpbuUG6adr9PW9p13PEC3OTle32nnM09q\nr75m2J7uvf96eJLJw47X1G/W/4IDe6T/WpeU0yuVyTuUi2a28uv417zz5keOf5Wn6S1k32dffJde\ndz/97PXflO0vvhv/Ju8vv7r5n7vxCz/ZB9cmVUjfJsU0asYr6mAJ1WXjTY6cuywu2ntN3Hz7Wfc9\nVTI8wNbx+l7HPfNwOExnbN96UJ0YN+v7AI3Yyq/nepWn6e3Ovulr/LN9P/9pyj7f9sH1ShkSxhKJ\nQpkZMqgOliFlxhNHz51YmrUQUumy+T2md971ALP3X/3MaW8eccl0r7twsDCjnJ88DfCybeXXc73K\n0/RWZl/+/+3zbR9Qp0jKoMUmmWZQKC1THT9+7pK4qO6nbcs3n9551wPMjo/TjnzmtLcQcaHaS/ca\nT01PVDPy3bQfNGMrv57rVZ6mtzr7fvfrv/ydv/qH+lWmAVen75q6byamGZR0MVP/5YgVcxfERYuT\ni+WbT++86wFmx1Og5Wm7puy58zTixr1jZ2Q73xh4abby67le5Wl6q7Mv/Zm/6akyDbgS33zZd8rY\nLCmJxnwZ/pbCNGo6qWZC1S7Hz52Li/Zn0OTm1V/pqO+84wG648Pdu4uGq4565rTXz08TxjOxV07U\n16TtpR/yLv2CAy/cVn4916s8TW9d9n319/81f5VpwJXIddLpEyXJXZTVIbOdKem6el5y7NyTVDfP\n916489IDhOqj5v9flmra4WdOe8PN+lt1p2Pu7ER3l+rEuDleMc4CXrit/HquV3ma3s7s+7f/+L/o\nvGOUaQAAvGDrf8g7e5VpAAC8YAvZBwDA5ZF9AABXQfYBAFwF2QcAcBVkHwDAVZB9AABXQfYBAFyF\nMft+DQDABSmR15tk3/8CAHARZN9legMAvHnz4cOHsjQi+y5V/o0OAJSlEdl3qWQfAGRlaUT2XSrZ\nBwBZWRqRfZdK9gFAVpZGZN9FeNzc3N6/KztZlX3vH+4e3pft8
Ml2AeBFKEsjsu8i7My+9w+vbu4e\nosYe7m5ePbz/ZLv5PzMAeAnK0ojsuwh7v+17e3dzc3P3tux9ul0AeBnK0ojsO1VXWo/3t1E6N2Nz\nxdFi89gdeFeuKAcmu3FxuarutrgkHx2vnVy22cThyc1u7+93Z9/bu/Tl2/uHV7nHPtkuALwUZWlE\n9p2qC7zcWim/tqqrz7iq7JJdu4+b29DfrJ/ZXzvePx3s32m8oMu//nDPX+kAgKwsjci+U0V1jaU1\n9FsXY1l3dqvI5rvdTlRf+uIwbfeH0pVjIfb3r960v7AzeZhM9gFAVpZGZN+p6tLqG21stUmUpcPj\nD2rr3XxZqr4u+DaPw7x00TBD9gHA6crSiOw7VZRWn1pDog31lY5MOmwSadVu2qh+vLvZdAWYpPuX\n7kt3y5t13o0XpPOyDwB2KEsjsu9UqcC6v12RlD7LARbSma7DUptlfRcW9Yy+2Ort0N+sSro6+8a7\n7f0rHQBw7crSiOw71bTAXh7ZBwBZWRqRfaeSfQDQhrI0IvtOJfsAoA1laUT2XSrZBwBZWRqRfZeq\n/E4HgKtXlkZkHwDAlZB9AABXQfZdpjcAQPcn3cvSiOy7VPk3OgBQlkZk36WSfQCQlaUR2XepZB8A\nZGVpRPZdKtkHAFlZGpF9l0r2AUBWlkZk36WSfQCQlaUR2XepZB8AZGVpRPZ9BO/ub2+yzePigcfN\nze1mE8dub2/7a0I6fP8uNnZeP157kOwDgKwsjci+84tIm+TZUHNps9uKI/2h6uJ+c+/1R5N9AJCV\npRHZd37dd3VjpY1f3XVS2VVhN8Ren3iHrj+W7AOArCyNyL6PJMdbara0lcNuMMm4OB87eSz7e68/\nkuwDgKwsjci+j6cvuYi2rWqbZly6brO5HVLv0PXHkX0AkJWlEdl3finbsr7k0hd4vXRsK+O60/UX\nfAeuP4rsA4CsLI3Ivksl+wAgK0sjsu9SyT4AyMrSiOy7VLIPALKyNCL7LpXsA4CsLI3IvktVfqcD\nwNUrSyOyDwDgSsg+AICrIPsu0xsAoPuT7mVpRPZdqvwbHQAoSyOy71LJPgDIytKI7LtUsg8AsrI0\nIvsulewDgKwsjci+SyX7ACArSyOy71LJPgDIytKI7LtUsg8AsrI0IvtO8ri5ub1/t7X97v72Jts8\ndme2D6QrN5s4FrvdrPtNPl3fqyi36C57LPeJY8Mthynzt81kHwBkZWlE9p1kSL1qO/456a7qmsdN\nf0V1aOi0tDmZWc3tLpttVlOGK9Nmv5XIPgDIytKI7DtJFVvDdvet2xhe45dwnRRpS7O2tmOzyEcO\nTVl4l0L2AUBWlkZk30l21VjJsFRfaWvXd3g7tscpsZXPHj9lm+wDgKwsjci+k1S1FQE21ljSJ9vs\nxMGGG46l2+etQ1PSxni0JvsAICtLI7LvRCm3OpvNGG1F//VbqrdeOrbYbfV2PyH9xY985OCU+bsU\nsg8AsrI0IvsulewDgKwsjci+SyX7ACArSyOy71LJPgDIytKI7LtUsg8AsrI0IvsuVfmdDgBXryyN\nyD4AgCsh+6B55X/PXr3Xr1+XrXac8Zmf8ePHW5ffi8DLJvugebHufn/13rx5k7un7LfgjM/8jB8/\nv3X5vQi8bLIPmpcX+7JzreruKYdevDM+8zN+fNkHDZF90DzZF2Sf7AMOkn3QPNkXZJ/sAw6SfdA8\n2Rdkn+wDDpJ90DzZF2Sf7AMOkn3QPNkXZJ/sAw6SfdC8T7vYP25ubu/flZ2VnjL3gI/WPW088xG3\n+lgfRPZBQ2QfNE/2Bdkn+4CDZB80T/YF2Sf7gINkHzRvabF/d397k20e036s+UXezxFwX45WObA9\ncelOs3oYr+lndZdtNnG4P5BUc2fvM33CdHqcOcybzRqcJ6F2fJDuvbsH3H7bJ9nxzNWv0uTdb+8f\ny+MNpwc7P37/ibp/2f282
ceMu88+b3dZOtodWfitUsg+aIjsg+bNFvu0qs+X506dEdWaX7bmE6sI\neNx0W9WRYpxf3SAd3Lqumju/7aA/Ff+sSqTb3DPrLNkX91/8ILGR9vvHOZuV2df/gi796919q/LY\naVJ1q+2PGcfyofiFDf379Of6CdXUgeyDhsg+aN72Yr+YKN3a3RnW/rxRbc8ndrEwSifriZ3prDjd\n7cwuS+o3quTZcbLIE/s7xUY+sDird4bs2/1BbueddQ4rs298hP7ZRsu3ik80mdTtLH7M/sr0a/2Y\nt4fJ9VtPHiOTfdAQ2QfNO5x945G9a/m+iYPZwj+9Jk53Owt9cNwbxVaZmLcm+9sPM/qY2dd13853\nPt2p2bfwC/HU7CuX5sKO7c3jOHfxeUayDxoi+6B5s25IC/tkcR4W6/HM4lo+m5hOzY9MD3TXlI5I\nN8ib88vCcHB22+FM/Qixnf584NAos1mVM2Rfd/9dH2R4wDPa8czje1efuPrs1fnB7oIsl6ZJ1a1m\nHzNt1j/e3cQvfPm89UevtwvZBw2RfdC8pdZJS3vWrerdoh9SRQ1r/+JavjVxnJqkQwsLf3XN4j0H\n1cHt286fsL+oPEdne9boHNm3/4N05+af6Ql2PvPwLyHiq7xleozur8h0hycfPTl4qx1/paP6POlg\nfcFwqvq3NtkuZB80RPZB857aOhfhPNn3aa155oXeqj3jx5d90BDZB82TfUH2yT7gINkHzZN9QfbJ\nPuAg2QfNk33h0rPvANkHHEP2QfNkX5B9sg84SPZB82RfkH2yDzhI9kHzZF+QfbIPOEj2QfNisY+l\nl9w9ZacRZ3zmZ/z4sg9aIfugebHowvMqvxeBl+05s6/870QAAFb6fqX4X2jPnH0fAABYL0qubB0h\nokv2AQA0KWdfiapDZB8AQKtkHwDAVZB9AABXQfad4u3dzauH92UHAKAFu7Iv+qpsVWQfAECrFrMv\n4ior+7048lKy7+HVTXj18P593rq5e5tP9PvJq4f62N3dXT48XtwZJ7wKZcrifcJwePZt39v+7q8e\nHtLm8GzL77vj/vnowucCAHiaefZ1vTcqRzux+5K+7cuhlZvp7V3uo4cIt77HUjlNyq8kVGwPF9Xb\n6X799bvuk9WzsrhkvHvXbd1O3lt43333X/pcAABPtJV9XeltK+deYPZttVeOrKkxucZ+mkwcvqUL\n/SW775PVAZd0+2V7enb5fffff/65AACebP5tXz6+JZ968dnXHSqbU8v5tSUKsEzfeZ+sDruk2y/b\nobr/jvfde/9djwcA8AS7/krHopeffXFs/AFrbVf2TfotJvcX7bpPtp190/t0P7MtO7ved9/9ZR8A\n8BE0mn1RTRNjWnXVVUlnor3GnX5qTqv5xYOlU9vvO04Z3iPunP5mSLr5nvcNx9x/8kAAAE/Q8rd9\nL9X8u0AAgGcn+86m+rpv358LBAB4FpF9kVLHk30AAE2KjFtL9gEANClqatUo+wAA2hMptXaUfQAA\nTYqaWjXKPgCA9kRKrR1lHwBAk6KmVo2yDwCgPZFSa0fZBwDQpKipVaPsAwBoT6TU2lH2AQA0KWpq\n1Sj7AADaEym1dpR9AABNippaNco+AID2REqtHWUfAECToqZWjbIPAKA9kVJrR9kHANCkqKlVo+wD\nAGhPpNTaUfYBADQpamrVKPsAANoTKbV2lH0AAE2Kmlo1yj4AgPZESq0dZR8AQJOiplaNsg8AoD2R\nUmtH2QcA0KSoqVWj7AMAaE+k1NpR9gEANClqatUo+wAA2hMptXaUfQAATYqaWjXKPgCA9kRKrR1l\nHwBAk6KmVo2yDwCgPZFSa0fZBwDQpKipVaPsAwBoT6TU2lH2AQA0KWpq1Sj7AADaEym1dpR9AABN\nippaNco+AID2REqtHY/PvsfNze39u7LTmR8Z7Dk1kn0AACeLmlo1yj4AgPZESq0dZR8AQJO
iplaN\nZ8q+d/e3N9nmcXKq23gsZ7fmyz4AgNNESq0dV2XfXNV2uegeN91WnX39VSkN+8s6sg8A4GRRU6vG\nc3zbN37V10nf99XZN86KnfxlYCb7AABOEym1djxX9tU5Fxazb/s62QcAcLKoqVXjObIvbWydWzo1\nq0PZBwBwmkipteNZsm/6c97ZD3k3m3Jy6ytB2QcAcLKoqVXj8dl3mnksjmQfAMBpIqXWjrIPAKBJ\nUVOrRtkHANCeSKm148fOvn1kHwDAyaKmVo2yDwCgPZFSa0fZBwDQpKipVaPsAwBoT6TU2lH2AQA0\nKWpq1Sj7AADaEym1dpR9AABNippaNco+AID2REqtHWUfAECToqZWjbIPAKA9kVJrR9kHANCkqKlV\no+wDAGhPpNTaUfYBADQpamrVKPsAANoTKbV2lH0AAE2Kmlo1yj4AgPZESq0dZR8AQJOiplaNsg8A\noD2RUmtH2QcA0KSoqVWj7AMAaE+k1NpR9gEANClqatUo+wAA2hMptXaUfQAATYqaWjXKPgCA9kRK\nrR1lHwBAk6KmVo2yDwCgPZFSa0fZBwDQpKipVaPsAwBoT6TU2lH2AQA0KWpq1Sj7AADaEym1dpR9\nAABNippaNco+AID2REqtHWUfAECToqZWjbIPAKA9kVJrR9kHANCe71d6/fq17AMAaE+UXNk6QkSX\n7AMAaFLOvhJVh8g+AIBWyT4AgKsg+wAAroLsAwC4CrIPAOAqbGVfZNVcOSf7AADaNf+2ryu9Ub5s\nOCX7AACatPhD3q73knxNOSr7AADatevP9s2bL8g+AIBW7fkrHfPjsg8AoFV7sm9O9gEAtEr2AQBc\nBdkHAHAVIvsipY73zNlXnhoAgJUi49Y6kH0AAFyMEnm9MfsAALhgsg8A4CrIPgCAqyD7AACuguwD\nALgCv/nN/wPm+4OUlXtGNAAAAABJRU5ErkJggg==\n", + "prompt_number": 15, + "text": [ + "" + ] + } + ], + "prompt_number": 15 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "On importe le module [pyensae](http://www.xavierdupre.fr/app/pyensae/helpsphinx/index.html) et on v\u00e9rifie que la version est sup\u00e9rieure \u00e0 0.8." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "import pyensae\n", + "pyensae.__version__" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "metadata": {}, + "output_type": "pyout", + "prompt_number": 10, + "text": [ + "'0.8'" + ] + } + ], + "prompt_number": 10 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "On ajoute les commandes magiques aux notebooks." 
+ ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "pyensae.register_magics()" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 11 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We open the SSH connection, which stays open until it is explicitly closed." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%remote_open" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "metadata": {}, + "output_type": "pyout", + "prompt_number": 18, + "text": [ + "" + ] + } + ], + "prompt_number": 18 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We list the contents of the directory associated with your account on the remote machine:" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%remote_cmd ls -l" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
total 8\n",
+        "-rw-rw-r-- 1 xavierdupre xavierdupre 242 Oct 22 00:51 ex.py\n",
+        "-rw-rw-r-- 1 xavierdupre xavierdupre 634 Aug 29 00:53 first_script.pig\n",
+        "
" + ], + "metadata": {}, + "output_type": "pyout", + "prompt_number": 5, + "text": [ + "" + ] + } + ], + "prompt_number": 5 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This is a Linux command. The most common commands are described in [Les commandes de base en console](http://doc.ubuntu-fr.org/tutoriel/console_commandes_de_base). The next instruction uploads a file from the local computer to the gateway." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%remote_up ConfLongDemo_JSI.small.txt ConfLongDemo_JSI.small.example.txt" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 6 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We check that the file was transferred:" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%remote_cmd ls -l" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
total 140\n",
+        "-rw-rw-r-- 1 xavierdupre xavierdupre 132727 Oct 22 02:01 ConfLongDemo_JSI.small.example.txt\n",
+        "-rw-rw-r-- 1 xavierdupre xavierdupre    242 Oct 22 00:51 ex.py\n",
+        "-rw-rw-r-- 1 xavierdupre xavierdupre    634 Aug 29 00:53 first_script.pig\n",
+        "
" + ], + "metadata": {}, + "output_type": "pyout", + "prompt_number": 7, + "text": [ + "" + ] + } + ], + "prompt_number": 7 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we list the contents of the directory associated with your account on the cluster:" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%remote_cmd hdfs dfs -ls" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
Found 4 items\n",
+        "drwx------   - xavierdupre xavierdupre          0 2014-08-29 00:49 .staging\n",
+        "drwxr-xr-x   - xavierdupre xavierdupre          0 2014-08-29 00:49 oozie-oozi\n",
+        "drwxr-xr-x   - xavierdupre xavierdupre          0 2014-08-29 00:19 stations_count_2013-05-24.paris.short.pig.2.txt\n",
+        "drwxr-xr-x   - xavierdupre xavierdupre          0 2014-08-29 00:49 test\n",
+        "
" + ], + "metadata": {}, + "output_type": "pyout", + "prompt_number": 8, + "text": [ + "" + ] + } + ], + "prompt_number": 8 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The HDFS commands are described in [Apache Hadoop 2.3.0](http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/FileSystemShell.html). They are very close to the Linux commands. Next, we upload the file to the cluster's distributed file system (HDFS):" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%remote_cmd hdfs dfs -put ConfLongDemo_JSI.small.example.txt ./ConfLongDemo_JSI.small.example.txt" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
"
+       ],
+       "metadata": {},
+       "output_type": "pyout",
+       "prompt_number": 12,
+       "text": [
+        ""
+       ]
+      }
+     ],
+     "prompt_number": 12
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Then we check that the file was uploaded to the cluster:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "%remote_cmd hdfs dfs -ls"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "html": [
+        "
Found 5 items\n",
+        "drwx------   - xavierdupre xavierdupre          0 2014-08-29 00:49 .staging\n",
+        "-rw-r--r--   3 xavierdupre xavierdupre     132727 2014-10-22 02:03 ConfLongDemo_JSI.small.example.txt\n",
+        "drwxr-xr-x   - xavierdupre xavierdupre          0 2014-08-29 00:49 oozie-oozi\n",
+        "drwxr-xr-x   - xavierdupre xavierdupre          0 2014-08-29 00:19 stations_count_2013-05-24.paris.short.pig.2.txt\n",
+        "drwxr-xr-x   - xavierdupre xavierdupre          0 2014-08-29 00:49 test\n",
+        "
"
+       ],
+       "metadata": {},
+       "output_type": "pyout",
+       "prompt_number": 13,
+       "text": [
+        ""
+       ]
+      }
+     ],
+     "prompt_number": 13
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "We look at the end of the file on the cluster:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "%remote_cmd hdfs dfs -tail ConfLongDemo_JSI.small.example.txt"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "html": [
+        "
1,633790226377438480,27.05.2009 14:03:57:743,4.371500492095946,1.4781558513641355,0.5384233593940735,lying\r\n",
+        "993,A01,010-000-024-033,633790226377708776,27.05.2009 14:03:57:770,3.0621800422668457,1.0790562629699707,0.6795752048492432,lying\r\n",
+        "994,A01,020-000-032-221,633790226378519655,27.05.2009 14:03:57:853,4.36382532119751,1.4307395219802856,0.3206148743629456,lying\r\n",
+        "995,A01,010-000-024-033,633790226378789954,27.05.2009 14:03:57:880,3.0784008502960205,1.0197675228118896,0.6061218976974487,lying\r\n",
+        "996,A01,010-000-030-096,633790226379060251,27.05.2009 14:03:57:907,3.182415008544922,1.1020996570587158,0.29104289412498474,lying\r\n",
+        "997,A01,020-000-033-111,633790226379330550,27.05.2009 14:03:57:933,4.7574005126953125,1.285519003868103,-0.08946932852268219,lying\r\n",
+        "998,A01,020-000-032-221,633790226379600847,27.05.2009 14:03:57:960,4.3730292320251465,1.3821170330047607,0.38861045241355896,lying\r\n",
+        "999,A01,010-000-024-033,633790226379871138,27.05.2009 14:03:57:987,3.198556661605835,1.1257659196853638,0.3567752242088318,lying\r\n",
+        "
"
+       ],
+       "metadata": {},
+       "output_type": "pyout",
+       "prompt_number": 16,
+       "text": [
+        ""
+       ]
+      }
+     ],
+     "prompt_number": 16
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The file now follows the reverse path: we bring it back from the cluster to the local computer. First step, from the cluster to the gateway:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "%remote_cmd hdfs dfs -get ConfLongDemo_JSI.small.example.txt ConfLongDemo_JSI.small.example.is_back.txt"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "html": [
+        "
"
+       ],
+       "metadata": {},
+       "output_type": "pyout",
+       "prompt_number": 19,
+       "text": [
+        ""
+       ]
+      }
+     ],
+     "prompt_number": 19
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "We check that the file is on the gateway:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "%remote_cmd ls -l"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "html": [
+        "
total 272\n",
+        "-rw-r--r-- 1 xavierdupre xavierdupre 132727 Oct 22 02:23 ConfLongDemo_JSI.small.example.is_back.txt\n",
+        "-rw-rw-r-- 1 xavierdupre xavierdupre 132727 Oct 22 02:01 ConfLongDemo_JSI.small.example.txt\n",
+        "-rw-rw-r-- 1 xavierdupre xavierdupre    242 Oct 22 00:51 ex.py\n",
+        "-rw-rw-r-- 1 xavierdupre xavierdupre    634 Aug 29 00:53 first_script.pig\n",
+        "
"
+       ],
+       "metadata": {},
+       "output_type": "pyout",
+       "prompt_number": 20,
+       "text": [
+        ""
+       ]
+      }
+     ],
+     "prompt_number": 20
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Second transfer, from the gateway to the local computer:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "%remote_down ConfLongDemo_JSI.small.example.is_back.txt ConfLongDemo_JSI.small.example.is_back_local.txt"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [],
+     "prompt_number": 22
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "import os\n",
+      "os.listdir(\".\")"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "metadata": {},
+       "output_type": "pyout",
+       "prompt_number": 23,
+       "text": [
+        "['cluster1.png',\n",
+        " 'ConfLongDemo_JSI.db3',\n",
+        " 'ConfLongDemo_JSI.small.example.is_back_local.txt',\n",
+        " 'ConfLongDemo_JSI.small.txt',\n",
+        " 'ConfLongDemo_JSI.txt',\n",
+        " 'hdfs1.png',\n",
+        " 'hdfspath.png',\n",
+        " 'servercred.png',\n",
+        " 'td3a_cenonce_session1.ipynb',\n",
+        " 'td3a_cenonce_session6.ipynb']"
+       ]
+      }
+     ],
+     "prompt_number": 23
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "%remote_close"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [],
+     "prompt_number": 24
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "

Part 2: first map/reduce job with PIG

" ] }, {