From 4720736efbefa67c20cfc15f4633d30867f204a4 Mon Sep 17 00:00:00 2001 From: fafilia Date: Fri, 16 Oct 2020 17:08:01 +0700 Subject: [PATCH 1/2] delete last version of 00. General Question.ipynb --- 00. General Question.ipynb | 1102 ------------------------------------ 1 file changed, 1102 deletions(-) delete mode 100644 00. General Question.ipynb diff --git a/00. General Question.ipynb b/00. General Question.ipynb deleted file mode 100644 index 8770e39..0000000 --- a/00. General Question.ipynb +++ /dev/null @@ -1,1102 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# General Question" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Bagaimana cara mengubah format *scientific* menjadi format *float* pada data numerik?" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Format *scientific* dapat diubah menggunakan `set_option()` pada `pandas`. Berikut adalah *syntax* lengkapnya :" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:45.379212Z", - "start_time": "2020-10-15T07:47:45.364252Z" - } - }, - "outputs": [], - "source": [ - "import pandas as pd\n", - "pd.set_option('display.float_format', lambda x: '%.3f' % x)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Apakah terdapat link *open source* yang menyediakan data untuk latihan mandiri?" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Jika ingin mendownload data untuk latihan secara mandiri, Anda dapat mengunjungi https://www.kaggle.com/datasets. Didalamnya terdapat berbagai macam dataset dan juga contoh penggunaaan serta analisisnya." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Syntax apakah yang digunakan untuk merubah format dari int64 sehingga tampilan 1000000 berubah menjadi 1,000,000 ?" 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Untuk merubah tampilan *output* dengan pemisah koma pada angka, dapat menggunakan attribut `display.float_format` pada `pandas`. Berikut adalah syntax lengkapnya :" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:45.395201Z", - "start_time": "2020-10-15T07:47:45.380210Z" - } - }, - "outputs": [], - "source": [ - "pd.options.display.float_format = '{:,}'.format" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Bagaimana cara utk mengganti nama kolom?" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Mengubah nama kolom dapat menggunakan *method* `rename` sebagai berikut : \n", - "\n", - "```\n", - "df.rename(columns={\"to_replace\":\"new_replace\"})\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Bagaimana cara untuk menambah baris (row) pada data?" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Salah satu cara yang bisa digunakan untuk menambahkan *row* adalah dengan menggunakan *method* `concat()`. Pada dasarnya kita harus membuat terlebih dahulu *row* yang akan ditambahkan dalam bentuk *dataframe*, kemudian gabungkan data baru dengan *dataframe* yang telah ada dengan *method* `concat()` by row. " - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:45.411158Z", - "start_time": "2020-10-15T07:47:45.396167Z" - } - }, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
col1col2
013
124
\n", - "
" - ], - "text/plain": [ - " col1 col2\n", - "0 1 3\n", - "1 2 4" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "d = {'col1': [1, 2], 'col2': [3, 4]}\n", - "df = pd.DataFrame(data=d)\n", - "df" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:45.427085Z", - "start_time": "2020-10-15T07:47:45.412124Z" - } - }, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
col1col2
057
168
\n", - "
" - ], - "text/plain": [ - " col1 col2\n", - "0 5 7\n", - "1 6 8" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "e = {'col1': [5, 6], 'col2': [7, 8]}\n", - "df2 = pd.DataFrame(data=e)\n", - "df2" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Menambahkan baris pada `df2` ke `df1` :" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:45.443043Z", - "start_time": "2020-10-15T07:47:45.428082Z" - } - }, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
col1col2
013
124
257
368
\n", - "
" - ], - "text/plain": [ - " col1 col2\n", - "0 1 3\n", - "1 2 4\n", - "2 5 7\n", - "3 6 8" - ] - }, - "execution_count": 15, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "df = pd.concat([df, df2], axis=0, ignore_index=True)\n", - "df" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Bagaimana cara menghapus kolom secara permanen?" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Menghapus kolom secara permanen dapat menggunakan *syntax* `drop`" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:45.459000Z", - "start_time": "2020-10-15T07:47:45.444039Z" - } - }, - "outputs": [], - "source": [ - "df.drop(columns=['col2'], inplace=True)" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:45.474957Z", - "start_time": "2020-10-15T07:47:45.460996Z" - } - }, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
col1
01
12
25
36
\n", - "
" - ], - "text/plain": [ - " col1\n", - "0 1\n", - "1 2\n", - "2 5\n", - "3 6" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "df" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Bagaimana cara memisahkan satu file CSV menjadi beberapa file CSV?\n", - "\n", - "Kasus: Dataset `kiva` berisi transaksi pinjaman dari awal tahun 2014 sampai akhir tahun 2015. Kita ingin memisahkan data tersebut berdasarkan periode (tahun-bulan) dari `posted_time` pinjaman.\n", - "\n", - "Maka dari itu, kita ekstrak terlebih dahulu informasi `year_month` yang dibutuhkan." - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:46.309475Z", - "start_time": "2020-10-15T07:47:45.476953Z" - } - }, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
year_monthidfunded_amountloan_amountactivitysectorcountryregioncurrencypartner_idposted_timefunded_timeterm_in_monthslender_countrepayment_interval
02014-01653051300.0300.0Fruits & VegetablesFoodPakistanLahorePKR2472014-01-01 06:12:392014-01-02 10:06:321212irregular
12014-01653053575.0575.0RickshawTransportationPakistanLahorePKR2472014-01-01 06:51:082014-01-02 09:17:231114irregular
22014-01653068150.0150.0TransportationTransportationIndiaMaynaguriINR3342014-01-01 09:58:072014-01-01 16:01:36436bullet
32014-01653063200.0200.0EmbroideryArtsPakistanLahorePKR2472014-01-01 08:03:112014-01-01 13:00:00118irregular
42014-01653084400.0400.0Milk SalesFoodPakistanAbdul HakeemPKR2452014-01-01 11:53:192014-01-01 19:18:511416monthly
\n", - "
" - ], - "text/plain": [ - " year_month id funded_amount loan_amount activity \\\n", - "0 2014-01 653051 300.0 300.0 Fruits & Vegetables \n", - "1 2014-01 653053 575.0 575.0 Rickshaw \n", - "2 2014-01 653068 150.0 150.0 Transportation \n", - "3 2014-01 653063 200.0 200.0 Embroidery \n", - "4 2014-01 653084 400.0 400.0 Milk Sales \n", - "\n", - " sector country region currency partner_id \\\n", - "0 Food Pakistan Lahore PKR 247 \n", - "1 Transportation Pakistan Lahore PKR 247 \n", - "2 Transportation India Maynaguri INR 334 \n", - "3 Arts Pakistan Lahore PKR 247 \n", - "4 Food Pakistan Abdul Hakeem PKR 245 \n", - "\n", - " posted_time funded_time term_in_months lender_count \\\n", - "0 2014-01-01 06:12:39 2014-01-02 10:06:32 12 12 \n", - "1 2014-01-01 06:51:08 2014-01-02 09:17:23 11 14 \n", - "2 2014-01-01 09:58:07 2014-01-01 16:01:36 43 6 \n", - "3 2014-01-01 08:03:11 2014-01-01 13:00:00 11 8 \n", - "4 2014-01-01 11:53:19 2014-01-01 19:18:51 14 16 \n", - "\n", - " repayment_interval \n", - "0 irregular \n", - "1 irregular \n", - "2 bullet \n", - "3 irregular \n", - "4 monthly " - ] - }, - "execution_count": 18, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import pandas as pd\n", - "kiva = pd.read_csv(\"data_input/kiva.csv\")\n", - "kiva.insert(loc=0, \n", - " column='year_month',\n", - " value=kiva['posted_time'].astype('datetime64').dt.to_period('M'))\n", - "kiva.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Kemudian kita sediakan satu folder tempat menampung pecahan file CSV ke dalam `FOLDERPATH`:" - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:46.325366Z", - "start_time": "2020-10-15T07:47:46.310477Z" - } - }, - "outputs": [], - "source": [ - "import os\n", - "\n", - "FOLDERPATH = \"data_input/kiva/\"\n", - "\n", - "if not os.path.exists(FOLDERPATH):\n", - " os.makedirs(FOLDERPATH)" - ] - }, - { - "cell_type": 
"markdown", - "metadata": {}, - "source": [ - "Secara iteratif, lakukan conditional subsetting untuk DataFrame `kiva` berdasarkan masing-masing `year_month`. Hasil subset tersebut disimpan menggunakan method `.to_csv()` tanpa menggunakan nomor index." - ] - }, - { - "cell_type": "code", - "execution_count": 20, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:49.319842Z", - "start_time": "2020-10-15T07:47:46.326113Z" - } - }, - "outputs": [], - "source": [ - "for period in kiva['year_month'].unique():\n", - " kiva_subset = kiva[kiva['year_month'] == period]\n", - " \n", - " filename = f\"kiva-{period}.csv\"\n", - " \n", - " kiva_subset.to_csv(FOLDERPATH + filename, index=False)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Silahkan cek `FOLDERPATH`, seharusnya `kiva` sudah berhasil kita pisahkan menjadi 24 file CSV seperti gambar berikut:\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Bagaimana cara menggabungkan beberapa file CSV menjadi satu file CSV?\n", - "\n", - "Kasus: Kita memiliki 24 file CSV dataset `kiva` yang dipisahkan berdasarkan periode (tahun-bulan) seperti pada pertanyaan sebelumnya. Kita diminta untuk menggabungkannya menjadi satu file CSV saja untuk kebutuhan analisis.\n", - "\n", - "Maka dari itu, kita perlu tahu semua nama file CSV yang akan kita gabungkan menjadi satu. Caranya, gunakan method `glob()` kemudian kita spesifikan pola nama file yang ingin diambil. Penggunaan `*.csv` menandakan kita akan mengambil semua nama file dengan ekstensi csv." 
- ] - }, - { - "cell_type": "code", - "execution_count": 21, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:49.335064Z", - "start_time": "2020-10-15T07:47:49.320929Z" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "['data_input/kiva\\\\kiva-2014-01.csv',\n", - " 'data_input/kiva\\\\kiva-2014-02.csv',\n", - " 'data_input/kiva\\\\kiva-2014-03.csv',\n", - " 'data_input/kiva\\\\kiva-2014-04.csv',\n", - " 'data_input/kiva\\\\kiva-2014-05.csv',\n", - " 'data_input/kiva\\\\kiva-2014-06.csv',\n", - " 'data_input/kiva\\\\kiva-2014-07.csv',\n", - " 'data_input/kiva\\\\kiva-2014-08.csv',\n", - " 'data_input/kiva\\\\kiva-2014-09.csv',\n", - " 'data_input/kiva\\\\kiva-2014-10.csv',\n", - " 'data_input/kiva\\\\kiva-2014-11.csv',\n", - " 'data_input/kiva\\\\kiva-2014-12.csv',\n", - " 'data_input/kiva\\\\kiva-2015-01.csv',\n", - " 'data_input/kiva\\\\kiva-2015-02.csv',\n", - " 'data_input/kiva\\\\kiva-2015-03.csv',\n", - " 'data_input/kiva\\\\kiva-2015-04.csv',\n", - " 'data_input/kiva\\\\kiva-2015-05.csv',\n", - " 'data_input/kiva\\\\kiva-2015-06.csv',\n", - " 'data_input/kiva\\\\kiva-2015-07.csv',\n", - " 'data_input/kiva\\\\kiva-2015-08.csv',\n", - " 'data_input/kiva\\\\kiva-2015-09.csv',\n", - " 'data_input/kiva\\\\kiva-2015-10.csv',\n", - " 'data_input/kiva\\\\kiva-2015-11.csv',\n", - " 'data_input/kiva\\\\kiva-2015-12.csv']" - ] - }, - "execution_count": 21, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from glob import glob\n", - "\n", - "FOLDERPATH = \"data_input/kiva/\"\n", - "\n", - "filenames = glob(FOLDERPATH + '*.csv')\n", - "filenames" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Secara iteratif, baca file CSV menggunakan method `.read_csv()` kemudian simpan DataFrame ke dalam sebuah list." 
- ] - }, - { - "cell_type": "code", - "execution_count": 22, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:50.039671Z", - "start_time": "2020-10-15T07:47:49.335876Z" - } - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "24\n" - ] - } - ], - "source": [ - "df_list = []\n", - "for filename in filenames:\n", - " df = pd.read_csv(filename)\n", - " df_list.append(df)\n", - "print(len(df_list))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Dengan menggunakan method `.concat()`, semua DataFrame pada `df_list` akan digabungkan menjadi satu berdasarkan baris, dengan syarat semua nama kolom harus sama. Method `.reset_index()` digunakan agar penomoran index diulang dari 0 sampai banyaknya baris pada DataFrame." - ] - }, - { - "cell_type": "code", - "execution_count": 23, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:50.151003Z", - "start_time": "2020-10-15T07:47:50.040676Z" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "(323279, 15)" - ] - }, - "execution_count": 23, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "kiva_concat = pd.concat(df_list)\n", - "kiva_concat = kiva_concat.reset_index(drop=True)\n", - "kiva_concat.shape" - ] - }, - { - "cell_type": "code", - "execution_count": 24, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:50.166995Z", - "start_time": "2020-10-15T07:47:50.153001Z" - } - }, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
year_monthidfunded_amountloan_amountactivitysectorcountryregioncurrencypartner_idposted_timefunded_timeterm_in_monthslender_countrepayment_interval
02014-01653051300.0300.0Fruits & VegetablesFoodPakistanLahorePKR2472014-01-01 06:12:392014-01-02 10:06:321212irregular
12014-01653053575.0575.0RickshawTransportationPakistanLahorePKR2472014-01-01 06:51:082014-01-02 09:17:231114irregular
22014-01653068150.0150.0TransportationTransportationIndiaMaynaguriINR3342014-01-01 09:58:072014-01-01 16:01:36436bullet
32014-01653063200.0200.0EmbroideryArtsPakistanLahorePKR2472014-01-01 08:03:112014-01-01 13:00:00118irregular
42014-01653084400.0400.0Milk SalesFoodPakistanAbdul HakeemPKR2452014-01-01 11:53:192014-01-01 19:18:511416monthly
\n", - "
" - ], - "text/plain": [ - " year_month id funded_amount loan_amount activity \\\n", - "0 2014-01 653051 300.0 300.0 Fruits & Vegetables \n", - "1 2014-01 653053 575.0 575.0 Rickshaw \n", - "2 2014-01 653068 150.0 150.0 Transportation \n", - "3 2014-01 653063 200.0 200.0 Embroidery \n", - "4 2014-01 653084 400.0 400.0 Milk Sales \n", - "\n", - " sector country region currency partner_id \\\n", - "0 Food Pakistan Lahore PKR 247 \n", - "1 Transportation Pakistan Lahore PKR 247 \n", - "2 Transportation India Maynaguri INR 334 \n", - "3 Arts Pakistan Lahore PKR 247 \n", - "4 Food Pakistan Abdul Hakeem PKR 245 \n", - "\n", - " posted_time funded_time term_in_months lender_count \\\n", - "0 2014-01-01 06:12:39 2014-01-02 10:06:32 12 12 \n", - "1 2014-01-01 06:51:08 2014-01-02 09:17:23 11 14 \n", - "2 2014-01-01 09:58:07 2014-01-01 16:01:36 43 6 \n", - "3 2014-01-01 08:03:11 2014-01-01 13:00:00 11 8 \n", - "4 2014-01-01 11:53:19 2014-01-01 19:18:51 14 16 \n", - "\n", - " repayment_interval \n", - "0 irregular \n", - "1 irregular \n", - "2 bullet \n", - "3 irregular \n", - "4 monthly " - ] - }, - "execution_count": 24, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "kiva_concat.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Simpan objek `kiva_concat` ke dalam satu file CSV: " - ] - }, - { - "cell_type": "code", - "execution_count": 25, - "metadata": { - "ExecuteTime": { - "end_time": "2020-10-15T07:47:51.933015Z", - "start_time": "2020-10-15T07:47:50.167961Z" - } - }, - "outputs": [], - "source": [ - "kiva_concat.to_csv(\"data_input/kiva_concat.csv\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Bagaimana cara membuat *table of content* pada jupyter notebook?" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Untuk menampilkan *table of content* (TOC), pastikan didalam jupyter notebook sudah terinstall `nbextension`. 
Jika sudah terinstall buka config `nbextension` kemudian check pilihan `Table of Content`.\n", - "\n", - "Apabila belum terinstall `nbextension`, maka ikuti langkah pada poin di bawah ini untuk menginstall `nbextension`." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Langkah menginstall `nbextension`!" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "1. Install contrib nbextension\n", - "```\n", - "conda install -c conda-forge jupyter_contrib_nbextensions \n", - "```\n", - "2. Install configurator\n", - "```\n", - "conda install -c conda-forge jupyter_nbextensions_configurator\n", - "```\n", - "\n", - "3. Mengaktifkan nbextension pada jupyter notebook \n", - "```\n", - "jupyter nbextensions_configurator enable --user\n", - "```" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "dataanalytics", - "language": "python", - "name": "dataanalytics" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.5" - }, - "toc": { - "base_numbering": 1, - "nav_menu": {}, - "number_sections": true, - "sideBar": true, - "skip_h1_title": false, - "title_cell": "Table of Contents", - "title_sidebar": "Contents", - "toc_cell": false, - "toc_position": {}, - "toc_section_display": true, - "toc_window_display": false - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} From 4cdb40322afa9beb85f0bbf7a23c1347f9843e8f Mon Sep 17 00:00:00 2001 From: fafilia Date: Fri, 16 Oct 2020 17:08:59 +0700 Subject: [PATCH 2/2] Proofread 00. General Question.ipynb --- 00. General Question.ipynb | 1109 ++++++++++++++++++++++++++++++++++++ 1 file changed, 1109 insertions(+) create mode 100644 00. General Question.ipynb diff --git a/00. General Question.ipynb b/00. 
General Question.ipynb new file mode 100644 index 0000000..1f01272 --- /dev/null +++ b/00. General Question.ipynb @@ -0,0 +1,1109 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# General Question" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Bagaimana cara mengubah format *scientific* menjadi format *float* pada data numerik?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Format *scientific* dapat diubah menggunakan `set_option()` pada `pandas`. Berikut adalah *syntax* lengkapnya :" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:45.379212Z", + "start_time": "2020-10-15T07:47:45.364252Z" + } + }, + "outputs": [], + "source": [ + "import pandas as pd\n", + "pd.set_option('display.float_format', lambda x: '%.3f' % x)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Anda dapat mengaplikasikan penggunaan nilai desimal setelah deklarasi fungsi `lambda x: `. Pada contoh di atas, kita mengapilkasikan 3 satuan angka desimal setelah tanda pemisah titik pada nilai float." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Apakah terdapat link *open source* yang menyediakan data untuk latihan mandiri?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Jika ingin mendownload data untuk latihan secara mandiri, Anda dapat mengunjungi https://www.kaggle.com/datasets. Didalamnya terdapat berbagai macam dataset dan juga contoh penggunaaan serta analisisnya." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Syntax apakah yang digunakan untuk mengubah format dari int64 sehingga tampilan 1000000 berubah menjadi 1,000,000 ?" 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Untuk mengubah tampilan *output* dengan pemisah koma pada angka, dapat menggunakan attribut `display.float_format` pada `pandas`. Berikut adalah syntax lengkapnya :" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:45.395201Z", + "start_time": "2020-10-15T07:47:45.380210Z" + } + }, + "outputs": [], + "source": [ + "pd.options.display.float_format = '{:,}'.format" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Bagaimana cara utk mengganti nama kolom?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Mengubah nama kolom dapat menggunakan *method* `rename` sebagai berikut : \n", + "\n", + "```\n", + "df.rename(columns={\"to_replace\":\"new_replace\"})\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Bagaimana cara untuk menambah baris (row) pada data?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Salah satu cara yang bisa digunakan untuk menambahkan *row* adalah dengan menggunakan *method* `concat()`. Pada dasarnya kita harus membuat terlebih dahulu *row* yang akan ditambahkan dalam bentuk *dataframe*, kemudian gabungkan data baru dengan *dataframe* yang telah ada dengan *method* `concat()` by row. " + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:45.411158Z", + "start_time": "2020-10-15T07:47:45.396167Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
col1col2
013
124
\n", + "
" + ], + "text/plain": [ + " col1 col2\n", + "0 1 3\n", + "1 2 4" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "d = {'col1': [1, 2], 'col2': [3, 4]}\n", + "df = pd.DataFrame(data=d)\n", + "df" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:45.427085Z", + "start_time": "2020-10-15T07:47:45.412124Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
col1col2
057
168
\n", + "
" + ], + "text/plain": [ + " col1 col2\n", + "0 5 7\n", + "1 6 8" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "e = {'col1': [5, 6], 'col2': [7, 8]}\n", + "df2 = pd.DataFrame(data=e)\n", + "df2" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Menambahkan baris pada `df2` ke `df1` :" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:45.443043Z", + "start_time": "2020-10-15T07:47:45.428082Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
col1col2
013
124
257
368
\n", + "
" + ], + "text/plain": [ + " col1 col2\n", + "0 1 3\n", + "1 2 4\n", + "2 5 7\n", + "3 6 8" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df = pd.concat([df, df2], axis=0, ignore_index=True)\n", + "df" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Bagaimana cara menghapus kolom secara permanen?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Menghapus kolom secara permanen dapat menggunakan *syntax* `drop`" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:45.459000Z", + "start_time": "2020-10-15T07:47:45.444039Z" + } + }, + "outputs": [], + "source": [ + "df.drop(columns=['col2'], inplace=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:45.474957Z", + "start_time": "2020-10-15T07:47:45.460996Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
col1
01
12
25
36
\n", + "
" + ], + "text/plain": [ + " col1\n", + "0 1\n", + "1 2\n", + "2 5\n", + "3 6" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Bagaimana cara memisahkan satu file CSV menjadi beberapa file CSV?\n", + "\n", + "Kasus: Dataset `kiva` berisi transaksi pinjaman dari awal tahun 2014 sampai akhir tahun 2015. Kita ingin memisahkan data tersebut berdasarkan periode (tahun-bulan) dari `posted_time` pinjaman.\n", + "\n", + "Maka dari itu, kita ekstrak terlebih dahulu informasi `year_month` yang dibutuhkan." + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:46.309475Z", + "start_time": "2020-10-15T07:47:45.476953Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
year_monthidfunded_amountloan_amountactivitysectorcountryregioncurrencypartner_idposted_timefunded_timeterm_in_monthslender_countrepayment_interval
02014-01653051300.0300.0Fruits & VegetablesFoodPakistanLahorePKR2472014-01-01 06:12:392014-01-02 10:06:321212irregular
12014-01653053575.0575.0RickshawTransportationPakistanLahorePKR2472014-01-01 06:51:082014-01-02 09:17:231114irregular
22014-01653068150.0150.0TransportationTransportationIndiaMaynaguriINR3342014-01-01 09:58:072014-01-01 16:01:36436bullet
32014-01653063200.0200.0EmbroideryArtsPakistanLahorePKR2472014-01-01 08:03:112014-01-01 13:00:00118irregular
42014-01653084400.0400.0Milk SalesFoodPakistanAbdul HakeemPKR2452014-01-01 11:53:192014-01-01 19:18:511416monthly
\n", + "
" + ], + "text/plain": [ + " year_month id funded_amount loan_amount activity \\\n", + "0 2014-01 653051 300.0 300.0 Fruits & Vegetables \n", + "1 2014-01 653053 575.0 575.0 Rickshaw \n", + "2 2014-01 653068 150.0 150.0 Transportation \n", + "3 2014-01 653063 200.0 200.0 Embroidery \n", + "4 2014-01 653084 400.0 400.0 Milk Sales \n", + "\n", + " sector country region currency partner_id \\\n", + "0 Food Pakistan Lahore PKR 247 \n", + "1 Transportation Pakistan Lahore PKR 247 \n", + "2 Transportation India Maynaguri INR 334 \n", + "3 Arts Pakistan Lahore PKR 247 \n", + "4 Food Pakistan Abdul Hakeem PKR 245 \n", + "\n", + " posted_time funded_time term_in_months lender_count \\\n", + "0 2014-01-01 06:12:39 2014-01-02 10:06:32 12 12 \n", + "1 2014-01-01 06:51:08 2014-01-02 09:17:23 11 14 \n", + "2 2014-01-01 09:58:07 2014-01-01 16:01:36 43 6 \n", + "3 2014-01-01 08:03:11 2014-01-01 13:00:00 11 8 \n", + "4 2014-01-01 11:53:19 2014-01-01 19:18:51 14 16 \n", + "\n", + " repayment_interval \n", + "0 irregular \n", + "1 irregular \n", + "2 bullet \n", + "3 irregular \n", + "4 monthly " + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import pandas as pd\n", + "kiva = pd.read_csv(\"data_input/kiva.csv\")\n", + "kiva.insert(loc=0, \n", + " column='year_month',\n", + " value=pd.to_datetime(kiva['posted_time']).dt.to_period('M'))\n", + "kiva.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we prepare a folder, referenced by `FOLDERPATH`, to hold the split CSV files:" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:46.325366Z", + "start_time": "2020-10-15T07:47:46.310477Z" + } + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "FOLDERPATH = \"data_input/kiva/\"\n", + "\n", + "if not os.path.exists(FOLDERPATH):\n", + " os.makedirs(FOLDERPATH)" + ] + }, + { + "cell_type": 
"markdown", + "metadata": {}, + "source": [ + "Iterate over each `year_month` value and subset the `kiva` DataFrame with a boolean condition. Each subset is saved with the `.to_csv()` method, passing `index=False` so that the row index is not written to the file." + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:49.319842Z", + "start_time": "2020-10-15T07:47:46.326113Z" + } + }, + "outputs": [], + "source": [ + "for period in kiva['year_month'].unique():\n", + " kiva_subset = kiva[kiva['year_month'] == period]\n", + " \n", + " filename = f\"kiva-{period}.csv\"\n", + " \n", + " kiva_subset.to_csv(FOLDERPATH + filename, index=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Check `FOLDERPATH`: `kiva` should now be split into 24 CSV files, one for each `year_month` period." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## How do you combine several CSV files into a single CSV file?\n", + "\n", + "Case: We have 24 CSV files of the `kiva` dataset, split by period (year-month) as in the previous question. We are asked to combine them into a single CSV file for analysis.\n", + "\n", + "To do that, we first need the names of all the CSV files to combine. Use the `glob()` function and specify the filename pattern to match. The pattern `*.csv` matches every filename with the `.csv` extension." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:49.335064Z", + "start_time": "2020-10-15T07:47:49.320929Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['data_input/kiva\\\\kiva-2014-01.csv',\n", + " 'data_input/kiva\\\\kiva-2014-02.csv',\n", + " 'data_input/kiva\\\\kiva-2014-03.csv',\n", + " 'data_input/kiva\\\\kiva-2014-04.csv',\n", + " 'data_input/kiva\\\\kiva-2014-05.csv',\n", + " 'data_input/kiva\\\\kiva-2014-06.csv',\n", + " 'data_input/kiva\\\\kiva-2014-07.csv',\n", + " 'data_input/kiva\\\\kiva-2014-08.csv',\n", + " 'data_input/kiva\\\\kiva-2014-09.csv',\n", + " 'data_input/kiva\\\\kiva-2014-10.csv',\n", + " 'data_input/kiva\\\\kiva-2014-11.csv',\n", + " 'data_input/kiva\\\\kiva-2014-12.csv',\n", + " 'data_input/kiva\\\\kiva-2015-01.csv',\n", + " 'data_input/kiva\\\\kiva-2015-02.csv',\n", + " 'data_input/kiva\\\\kiva-2015-03.csv',\n", + " 'data_input/kiva\\\\kiva-2015-04.csv',\n", + " 'data_input/kiva\\\\kiva-2015-05.csv',\n", + " 'data_input/kiva\\\\kiva-2015-06.csv',\n", + " 'data_input/kiva\\\\kiva-2015-07.csv',\n", + " 'data_input/kiva\\\\kiva-2015-08.csv',\n", + " 'data_input/kiva\\\\kiva-2015-09.csv',\n", + " 'data_input/kiva\\\\kiva-2015-10.csv',\n", + " 'data_input/kiva\\\\kiva-2015-11.csv',\n", + " 'data_input/kiva\\\\kiva-2015-12.csv']" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from glob import glob\n", + "\n", + "FOLDERPATH = \"data_input/kiva/\"\n", + "\n", + "filenames = sorted(glob(FOLDERPATH + '*.csv'))\n", + "filenames" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Iterate over the filenames, read each CSV file with `pd.read_csv()`, and append the resulting DataFrame to a list." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:50.039671Z", + "start_time": "2020-10-15T07:47:49.335876Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "24\n" + ] + } + ], + "source": [ + "df_list = []\n", + "for filename in filenames:\n", + " df = pd.read_csv(filename)\n", + " df_list.append(df)\n", + "print(len(df_list))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`pd.concat()` stacks all the DataFrames in `df_list` by row into a single DataFrame. Columns are aligned by name, so every file should have the same column names; any column missing from a file is filled with NaN. `.reset_index(drop=True)` then renumbers the index from 0 up to the number of rows minus 1." + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:50.151003Z", + "start_time": "2020-10-15T07:47:50.040676Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(323279, 15)" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "kiva_concat = pd.concat(df_list)\n", + "kiva_concat = kiva_concat.reset_index(drop=True)\n", + "kiva_concat.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:50.166995Z", + "start_time": "2020-10-15T07:47:50.153001Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\"><th></th><th>year_month</th><th>id</th><th>funded_amount</th><th>loan_amount</th><th>activity</th><th>sector</th><th>country</th><th>region</th><th>currency</th><th>partner_id</th><th>posted_time</th><th>funded_time</th><th>term_in_months</th><th>lender_count</th><th>repayment_interval</th></tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr><th>0</th><td>2014-01</td><td>653051</td><td>300.0</td><td>300.0</td><td>Fruits &amp; Vegetables</td><td>Food</td><td>Pakistan</td><td>Lahore</td><td>PKR</td><td>247</td><td>2014-01-01 06:12:39</td><td>2014-01-02 10:06:32</td><td>12</td><td>12</td><td>irregular</td></tr>\n",
+ "    <tr><th>1</th><td>2014-01</td><td>653053</td><td>575.0</td><td>575.0</td><td>Rickshaw</td><td>Transportation</td><td>Pakistan</td><td>Lahore</td><td>PKR</td><td>247</td><td>2014-01-01 06:51:08</td><td>2014-01-02 09:17:23</td><td>11</td><td>14</td><td>irregular</td></tr>\n",
+ "    <tr><th>2</th><td>2014-01</td><td>653068</td><td>150.0</td><td>150.0</td><td>Transportation</td><td>Transportation</td><td>India</td><td>Maynaguri</td><td>INR</td><td>334</td><td>2014-01-01 09:58:07</td><td>2014-01-01 16:01:36</td><td>43</td><td>6</td><td>bullet</td></tr>\n",
+ "    <tr><th>3</th><td>2014-01</td><td>653063</td><td>200.0</td><td>200.0</td><td>Embroidery</td><td>Arts</td><td>Pakistan</td><td>Lahore</td><td>PKR</td><td>247</td><td>2014-01-01 08:03:11</td><td>2014-01-01 13:00:00</td><td>11</td><td>8</td><td>irregular</td></tr>\n",
+ "    <tr><th>4</th><td>2014-01</td><td>653084</td><td>400.0</td><td>400.0</td><td>Milk Sales</td><td>Food</td><td>Pakistan</td><td>Abdul Hakeem</td><td>PKR</td><td>245</td><td>2014-01-01 11:53:19</td><td>2014-01-01 19:18:51</td><td>14</td><td>16</td><td>monthly</td></tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>
" + ], + "text/plain": [ + " year_month id funded_amount loan_amount activity \\\n", + "0 2014-01 653051 300.0 300.0 Fruits & Vegetables \n", + "1 2014-01 653053 575.0 575.0 Rickshaw \n", + "2 2014-01 653068 150.0 150.0 Transportation \n", + "3 2014-01 653063 200.0 200.0 Embroidery \n", + "4 2014-01 653084 400.0 400.0 Milk Sales \n", + "\n", + " sector country region currency partner_id \\\n", + "0 Food Pakistan Lahore PKR 247 \n", + "1 Transportation Pakistan Lahore PKR 247 \n", + "2 Transportation India Maynaguri INR 334 \n", + "3 Arts Pakistan Lahore PKR 247 \n", + "4 Food Pakistan Abdul Hakeem PKR 245 \n", + "\n", + " posted_time funded_time term_in_months lender_count \\\n", + "0 2014-01-01 06:12:39 2014-01-02 10:06:32 12 12 \n", + "1 2014-01-01 06:51:08 2014-01-02 09:17:23 11 14 \n", + "2 2014-01-01 09:58:07 2014-01-01 16:01:36 43 6 \n", + "3 2014-01-01 08:03:11 2014-01-01 13:00:00 11 8 \n", + "4 2014-01-01 11:53:19 2014-01-01 19:18:51 14 16 \n", + "\n", + " repayment_interval \n", + "0 irregular \n", + "1 irregular \n", + "2 bullet \n", + "3 irregular \n", + "4 monthly " + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "kiva_concat.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Save the `kiva_concat` object to a single CSV file, again without the row index:" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "ExecuteTime": { + "end_time": "2020-10-15T07:47:51.933015Z", + "start_time": "2020-10-15T07:47:50.167961Z" + } + }, + "outputs": [], + "source": [ + "kiva_concat.to_csv(\"data_input/kiva_concat.csv\", index=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## How do you create a *table of contents* in Jupyter Notebook?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To display a *table of contents* (TOC), make sure `nbextension` is installed in Jupyter Notebook. 
If it is already installed, open the `nbextension` configuration and tick the `Table of Contents` option.\n", + "\n", + "If `nbextension` is not installed yet, follow the steps in the next section to install it." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Steps to install `nbextension`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "1. Install the contrib nbextensions\n", + "```\n", + "conda install -c conda-forge jupyter_contrib_nbextensions \n", + "```\n", + "2. Install the configurator\n", + "```\n", + "conda install -c conda-forge jupyter_nbextensions_configurator\n", + "```\n", + "\n", + "3. Enable nbextension in Jupyter Notebook \n", + "```\n", + "jupyter nbextensions_configurator enable --user\n", + "```" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "algoritma", + "language": "python", + "name": "algoritma" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.3" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": true, + "sideBar": true, + "skip_h1_title": false, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": {}, + "toc_section_display": true, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}
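The split-and-merge steps in the notebook above can be condensed into two small helpers. This is only a minimal sketch, not the notebook's own code: the function names `split_by_month` and `merge_csvs`, the `prefix` parameter, and the four-row toy DataFrame are hypothetical and introduced here for illustration. It assumes `pandas` is installed and writes to a temporary directory instead of `data_input/kiva/`.

```python
from pathlib import Path
import tempfile

import pandas as pd


def split_by_month(df, time_col, out_dir, prefix="kiva"):
    """Write one CSV per calendar month of `time_col` and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Convert the timestamp column to a year-month period, e.g. 2014-01.
    periods = pd.to_datetime(df[time_col]).dt.to_period("M")
    paths = []
    for period, subset in df.groupby(periods):
        path = out_dir / f"{prefix}-{period}.csv"
        subset.to_csv(path, index=False)  # do not write the row index
        paths.append(path)
    return paths


def merge_csvs(in_dir, pattern="*.csv"):
    """Read every CSV matching `pattern` in `in_dir` and stack them by row."""
    files = sorted(Path(in_dir).glob(pattern))  # sorted for a stable row order
    frames = [pd.read_csv(f) for f in files]
    return pd.concat(frames, ignore_index=True)


# Toy data standing in for the real kiva dataset (made-up rows).
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "posted_time": [
        "2014-01-01 06:12:39",
        "2014-01-15 09:00:00",
        "2014-02-03 11:53:19",
        "2015-12-31 23:59:59",
    ],
})

with tempfile.TemporaryDirectory() as tmp:
    written = split_by_month(toy, "posted_time", tmp)
    merged = merge_csvs(tmp)

print([p.name for p in written])
print(merged.shape)
```

Passing `ignore_index=True` to `pd.concat()` renumbers the rows in one step, which has the same effect as calling `.reset_index(drop=True)` afterwards.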